NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as 
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past 
decade. The now routine hybridization cloning and expression cloning techniques clone novel 
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of 
hybridization cloning; activity of the protein in the case of expression cloning). More recent 
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences 
based on the presence of a now well-recognized secretory leader sequence motif, as well as 
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the 
state of the art by making available large numbers of DNA/amino acid sequences for proteins 
that are known to have biological activity, for example, by virtue of their secreted nature in the 
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based 
techniques, or by virtue of structural similarity to other genes of known biological activity. 

Identified polynucleotide and polypeptide sequences have numerous applications, for
example, in diagnostics, forensics, gene mapping, and identification of mutations responsible for
genetic disorders or other traits, as well as in assessing biodiversity and producing many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, 
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic 
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more 
epitopes present on such polypeptides, as well as hybridomas producing such antibodies. 

The compositions of the present invention additionally include vectors, including expression 
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such 
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a 
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and 
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID 
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic 
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog 
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a 
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an 
amino acid sequence set forth in the Sequence Listing. 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding 
full length or mature protein. Polypeptides of the invention also include polypeptides with biological 
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in 
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the 
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically 
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g. host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene 
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA 
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used 
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample 
using, e.g., in situ hybridization. 

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical 
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a polypeptide 
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such 
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the 
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight 
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition 
which comprises the step of administering to a mammalian subject a therapeutically effective 
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier. 

In particular, the polypeptides and polynucleotides of the invention can be utilized, for 
example, in methods for the prevention and/or treatment of disorders involving aberrant protein 
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 
4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule. 
Likewise "immunologically active" or "immunological activity" refers to the capability of the 
natural, recombinant or synthetic polypeptide to induce a specific immune response in 
appropriate animals or cells and to bind with specific antibodies. 

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or 
enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the 
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that 
total complementarity exists between the single stranded molecules. The degree of 
complementarity between the nucleic acid strands has significant effects on the efficiency and 
strength of the hybridization between the nucleic acid strands. 
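
The base-pairing rule in the example above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the names `complement` and `complementarity` are hypothetical, only the four DNA bases are handled, and the two strands are assumed to be the same length and already aligned in antiparallel orientation.

```python
# Watson-Crick pairing for the four DNA bases (A-T, G-C).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand_5to3: str) -> str:
    """Return the partner strand, written 3'->5' under the input strand."""
    return "".join(PAIRS[base] for base in strand_5to3.upper())


def complementarity(strand_5to3: str, partner_3to5: str) -> float:
    """Fraction of aligned positions that base-pair.

    1.0 corresponds to "complete" complementarity; any lower value
    corresponds to "partial" complementarity.
    """
    if not strand_5to3 or len(strand_5to3) != len(partner_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_5to3.upper(), partner_3to5.upper())
    )
    return paired / len(strand_5to3)


print(complement("AGT"))              # partner of 5'-AGT-3', read 3'->5'
print(complementarity("AGT", "TCA"))  # complete complementarity
print(complementarity("AGT", "TCG"))  # partial complementarity
```

Reversing the output of `complement` gives the conventional 5'-to-3' reverse complement.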

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs are nucleic acid fragments which induce the expression of an 
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic 
origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the 
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T 
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences 
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this 
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or 
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic 
acid which is capable of being expressed in a recombinant transcriptional unit comprising 
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene. 
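
As an informal illustration of the notation described above, the following Python sketch substitutes U for T to give the RNA form of a listed sequence and enumerates the concrete sequences represented by a sequence containing N. The helper names `to_rna` and `expand_n` are hypothetical and are introduced only for this example.

```python
from itertools import product

DNA_BASES = "ACGT"


def to_rna(dna: str) -> str:
    """Write a DNA sequence as RNA by replacing T (thymine) with U (uracil)."""
    return dna.upper().replace("T", "U")


def expand_n(dna: str) -> list[str]:
    """List every concrete sequence that a sequence containing N can stand for.

    Each N contributes four alternatives, so the list grows as 4**(number of N);
    this is only practical for sequences with a handful of N positions.
    """
    choices = [DNA_BASES if base == "N" else base for base in dna.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(to_rna("AGTN"))
print(expand_n("AN"))
```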

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or 
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide 
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides, 
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100 
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30 
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides, 
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300.
In the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are
base pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also approximately
one in five because expressed sequences comprise less than approximately 5% of the entire
genome sequence.
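
The arithmetic behind these estimates can be checked with a short Python sketch. It assumes, as the text does, a genome of three billion base pairs and treats the "less than approximately 5%" expressed fraction as exactly 5%; the function name `kmers_per_base_pair` is hypothetical. The ratio it computes is the number of possible k-mers per base pair of target sequence, so a chance full match is expected roughly once in that many probes.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # assumption: upper bound quoted in the text, used as-is


def kmers_per_base_pair(k: int, target_bp: float) -> float:
    """Number of distinct k-mers (4**k) divided by the size of the target sequence."""
    return 4 ** k / target_bp


genome_20 = kmers_per_base_pair(20, GENOME_BP)
genome_17 = kmers_per_base_pair(17, GENOME_BP)
expressed_15 = kmers_per_base_pair(15, GENOME_BP * EXPRESSED_FRACTION)

print(f"20-mer vs genome:    about 1 in {genome_20:.0f}")
print(f"17-mer vs genome:    about 1 in {genome_17:.1f}")
print(f"15-mer vs expressed: about 1 in {expressed_15:.1f}")
```

Under these assumptions the ratios come out in the same range as the rounded figures quoted above (a few hundred for a twenty-mer, single digits for the seventeen-mer and fifteen-mer cases).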

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
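
The single-mismatch calculation can be sketched the same way. The Python below multiplies the full-match probability (1 ÷ 4^k) by the 3 x k possible single-base substitutions and then scales by the size of the target sequence, which is one plausible reading of how the rounded figures above were obtained. The function name and the exact 5% expressed fraction are assumptions made for illustration.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05   # assumption: expressed sequences taken as 5% of the genome


def expected_single_mismatch_hits(k: int, target_bp: float) -> float:
    """Expected chance occurrences of a k-mer with exactly one mismatch.

    p_full_match is 1 / 4**k; each of the k positions can be replaced by
    any of the 3 other bases, giving 3 * k single-mismatch variants.
    """
    p_full_match = 1 / 4 ** k
    variants = 3 * k
    return p_full_match * variants * target_bp


genome_25 = expected_single_mismatch_hits(25, GENOME_BP)
expressed_18 = expected_single_mismatch_hits(18, GENOME_BP * EXPRESSED_FRACTION)
genome_20 = expected_single_mismatch_hits(20, GENOME_BP)

print(f"25-mer vs genome:    {genome_25:.6f}")
print(f"18-mer vs expressed: {expressed_18:.3f}")
print(f"20-mer vs genome:    {genome_20:.3f}")
```

With these inputs the eighteen-mer and twenty-mer cases give values between roughly one in nine and one in six, the same order as the "approximately one in five" stated above, while the twenty-five mer value is far smaller.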

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
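
The ORF definition above can be expressed as a simple check. The following Python sketch assumes the standard genetic code, in which TAA, TAG and TGA are the termination codons; the text itself does not name the codons, and the function name `is_open_reading_frame` is hypothetical.

```python
# Assumption: termination codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a whole number of triplets containing no termination codon."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)


print(is_open_reading_frame("ATGAAAGGG"))  # three triplets, no stop codon
print(is_open_reading_frame("ATGTAAGGG"))  # second triplet is a stop codon
print(is_open_reading_frame("ATGAA"))      # not a whole number of triplets
```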

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide only encoding for 
the mature protein coding sequence.

WO 01/53312 PCT/US00/34263 

The term "derivative" refers to polypeptides chemically modified by such techniques as 
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer 
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing 
the sequence of the particular polypeptide with that of homologous peptides and minimizing the 
number of amino acid sequence changes made in regions of high homology (conserved regions) 
or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
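
The grouping of amino acids given above can be written as a lookup table. The Python sketch below uses the four classes exactly as listed in the preceding paragraph, with standard one-letter codes, and treats a substitution as "conservative" when both residues fall in the same class. The function name `is_conservative` is hypothetical, and same-class membership is a simplification of the broader similarity criteria the paragraph mentions.

```python
# Classes as listed in the text, written with standard one-letter codes.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),     # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),             # Arg, Lys, His
    "acidic": set("DE"),             # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same class (simplified criterion)."""
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"))  # leucine -> isoleucine, both nonpolar
print(is_conservative("D", "E"))  # aspartic acid -> glutamic acid, both acidic
print(is_conservative("D", "K"))  # acidic -> basic
```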

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use 
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed. 
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the 
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences 
having substantially equivalent biological activity and substantially equivalent expression 
characteristics are considered substantially equivalent. For the purposes of determining 
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can 
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
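
As a hedged illustration of the arithmetic in the definition above, the following minimal Python sketch computes the fraction of varied residues for two sequences that are assumed to be already aligned, with gaps written as "-". The function names, the 0.35 default and the example sequences are illustrative assumptions and are not taken from the specification; the sketch does not perform an alignment and is not the Jotun Hein method.

```python
def fraction_varied(reference_aln: str, subject_aln: str) -> float:
    """Changed positions divided by the number of residues in the subject sequence."""
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    # A column counts as one substitution, addition, or deletion when the rows differ.
    changes = sum(
        1 for r, s in zip(reference_aln.upper(), subject_aln.upper()) if r != s
    )
    # Total residues in the substantially equivalent (subject) sequence, gaps excluded.
    subject_residues = sum(1 for s in subject_aln if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_residues


def percent_identity(reference_aln: str, subject_aln: str) -> float:
    """Identity as the text uses the term: 100% minus the percent variation."""
    return 100.0 * (1.0 - fraction_varied(reference_aln, subject_aln))


def is_substantially_equivalent(reference_aln: str, subject_aln: str,
                                max_variation: float = 0.35) -> bool:
    """Sequence-level check only; functional equivalence is not assessed here."""
    return fraction_varied(reference_aln, subject_aln) <= max_variation


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    sub = "ACGTTCGTAC"  # hypothetical variant with a single substitution
    print(fraction_varied(ref, sub), percent_identity(ref, sub))
```

The threshold is a parameter so that the narrower embodiments (30%, 25%, 20%, 10%, 5%) can be checked with the same function.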

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO: 1787-3572 and 5359-7144; and a polynucleotide 
comprising the nucleotide sequence encoding the mature protein coding sequence of the 
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the 
present invention also include, but are not limited to, a polynucleotide that hybridizes under 
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1-1786
and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any 
polynucleotide recited above; (d) a polynucleotide, which encodes a species homolog of any of 
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a 
specific domain or truncation of the polypeptides of SEQ ID NO: 1787-3572 and 5359-7144. 
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 



WO 01/53312 PCT/US00/34263

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO: 1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene. 

The invention also provides polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.
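
One way to picture probes "based on unique nucleotide sequences" is the sketch below, which lists fixed-length fragments of a target sequence that do not occur verbatim in a set of related sequences. It is a minimal illustration under stated assumptions: exact string matching is only a crude stand-in for selective hybridization (mismatch tolerance, the complementary strand and melting temperature are ignored), and the function name, the 17-nucleotide default and the example sequences are hypothetical.

```python
from typing import List


def candidate_probes(target: str, related: List[str], length: int = 17) -> List[str]:
    """Return fragments of `target` that are absent from every related sequence."""
    target = target.upper()
    related = [seq.upper() for seq in related]
    probes: List[str] = []
    for i in range(len(target) - length + 1):
        fragment = target[i:i + length]
        # Keep the fragment only if no related sequence contains it exactly.
        if all(fragment not in seq for seq in related) and fragment not in probes:
            probes.append(fragment)
    return probes


if __name__ == "__main__":
    target = "ACGTACGTTAGCCTAGGATCCA"        # hypothetical sequence of interest
    family = ["ACGTACGTTAGCCTAGGTTTTT"]      # hypothetical related family member
    for probe in candidate_probes(target, family):
        print(probe)
```

In practice, candidates found this way would still need to be checked under the hybridization conditions actually used.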

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO: 1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
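
The codon substitution described above can be illustrated with a short sketch that translates two coding sequences using the standard genetic code and checks whether they encode the same amino acid sequence. The function names and the example ORFs are hypothetical; the sketch assumes in-frame DNA sequences containing only A, C, G and T.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True when two coding sequences differ only by synonymous codons."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    original = "ATGGCTCTGAAA"    # hypothetical ORF fragment
    synonymous = "ATGGCCTTAAAG"  # same amino acids, different codons
    print(translate(original), translate(synonymous))
    print(encode_same_protein(original, synonymous))
```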

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
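
The layout of a mutagenic oligonucleotide described above (a changed codon flanked on both sides by unchanged template sequence) can be sketched as follows. This is a minimal illustration, not a primer design tool: the function name, the default flank length of 12 nucleotides and the example template are assumptions, and duplex stability is not evaluated.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Return the sense-strand oligo carrying `new_codon` at the given codon position.

    `codon_index` is zero-based and counted from the first codon of `template`.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must be three DNA bases")
    start = codon_index * 3
    end = start + 3
    # Require unchanged template sequence on both sides of the altered codon.
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence around the target codon")
    return template[start - flank:start] + new_codon + template[end:end + flank]


if __name__ == "__main__":
    # Hypothetical coding sequence; codon 5 (TGC, cysteine) is changed to TCC (serine).
    template = "ATGGCTCTGAAAGGTTGCCATGACGAATTCCGT"
    print(mutagenic_oligo(template, codon_index=5, new_codon="TCC", flank=6))
```

The example replaces a cysteine codon with a serine codon, in the spirit of the cysteine substitutions mentioned earlier for eliminating disulfide bridges.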

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358
or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following 
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 


Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA. 



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO: 1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO: 1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID 
NO: 1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule 
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
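
Designing an antisense oligonucleotide by Watson and Crick base pairing amounts to taking the reverse complement of the chosen stretch of the coding strand. The sketch below shows this for a window around the first ATG of a coding strand; it is a minimal illustration in which the function names, the window placement and the example sequence are assumptions, and in which Hoogsteen pairing and modified nucleotides are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Watson-Crick reverse complement of a DNA sequence, written 5' to 3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, center: int, length: int = 20) -> str:
    """Antisense DNA oligo of `length` nucleotides for a window placed around `center`."""
    start = max(0, center - length // 2)
    window = coding_strand[start:start + length]
    if len(window) < length:
        raise ValueError("coding strand too short for the requested window")
    return reverse_complement(window)


if __name__ == "__main__":
    coding = "GGCACCATGGCTCTGAAAGGTTGCCAT"  # hypothetical coding strand
    start_site = coding.find("ATG")          # first ATG, taken here as the start codon
    print(antisense_oligo(coding, start_site, length=10))
```

The `length` parameter can be set to any of the oligonucleotide lengths listed above, provided the coding strand supplies enough sequence around the chosen position.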

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl- 
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the 
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inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 

described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequencing and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between 
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras 
can be performed as described in Hyrup (1996) above and Finn et al (1996) Nucl Acids Res 24: 
3357-63. For example, a DNA chain can be synthesized on a solid support using standard 
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' 
DNA segment (Finn et al (1996) above). Alternatively, chimeric molecules can be synthesized 
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1995) Bioorg Med Chem
Lett 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127 cells, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and
5359-7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NO:1-1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.
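
One way to locate candidate membrane-anchoring regions of the kind referred to above is a sliding-window hydropathy scan. The sketch below uses the published Kyte-Doolittle hydropathy scale; the function names, the 19-residue window, and the 1.6 cutoff are assumptions chosen for illustration (a commonly cited heuristic), and flagged windows are only candidates that would need experimental confirmation.

```python
# Kyte-Doolittle hydropathy values per residue (one-letter codes).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the protein."""
    protein = protein.upper()
    return [
        sum(KD[res] for res in protein[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def hydrophobic_windows(protein: str, window: int = 19,
                        threshold: float = 1.6) -> list[int]:
    """0-based start positions of windows whose mean hydropathy exceeds threshold."""
    profile = hydropathy_profile(protein, window)
    return [i for i, score in enumerate(profile) if score > threshold]
```

Windows returned by `hydrophobic_windows` mark stretches that might be deleted when designing a soluble form; sequences shorter than the window simply yield an empty result.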

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
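
The definition of a degenerate variant given above can be illustrated with a short Python sketch. It assumes the standard genetic code and in-frame DNA input; the function names and any example ORFs passed to them are hypothetical and are not sequences from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# organism-specific or organellar codes are not handled here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the two ORFs differ in nucleotide sequence but encode the same protein."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)
```

For example, the hypothetical ORFs `ATGGCTTCCAAGTAA` and `ATGGCCAGCAAATGA` differ at several positions yet both translate to the same four-residue peptide, so `is_degenerate_variant` returns `True` for that pair.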

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterised
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
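
The alanine-scanning approach described above begins with enumerating the variants to be made and tested. The following Python sketch generates that list for single residues or for strings of residues; the function name, the variant naming scheme, and the `window` parameter are illustrative assumptions, and the activity testing itself is an experimental step that the code does not model.

```python
def alanine_scan(protein: str, window: int = 1) -> dict[str, str]:
    """Return alanine-substituted variants of a protein sequence.

    Each run of `window` consecutive residues is replaced by alanine in turn.
    Runs that are already all alanine are skipped because the substitution
    would not change the sequence.
    """
    protein = protein.upper()
    variants = {}
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        if window == 1:
            name = f"{segment}{start + 1}A"          # e.g. K2A
        else:
            name = f"{start + 1}-{start + window}:polyA"
        variants[name] = protein[:start] + "A" * window + protein[start + window:]
    return variants
```

Each returned variant would then be expressed and assayed; loss of activity for a given variant suggests that the substituted residue or residues contribute to function.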

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are codified in 
computer programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam 
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein 
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. 
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are 
publicly available from the National Center for Biotechnology Information (NCBI) and other 
sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S., 
et al., J. Mol. Biol. 215:403-410 (1990)). 
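
By way of illustration only, the notion of aligning two sequences "to give the largest match" 
and then reporting identity can be sketched as follows. The sketch is a simplified global 
alignment in the style of Needleman and Wunsch, the class of method on which GAP is based. 
It is not the GAP or BLAST implementation: the scoring values (match, mismatch, gap), the 
linear gap penalty, the function names, and the choice of alignment length as the 
denominator for percent identity are all assumptions made for the example. 

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Simplified Needleman-Wunsch global alignment (linear gap penalty).

    Returns the two aligned strings, padded with '-' where gaps were opened.
    The scoring parameters are illustrative assumptions, not GAP defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns holding the same residue in both sequences."""
    aligned_a, aligned_b = global_align(a, b)
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Published programs differ from this sketch in using substitution matrices to score 
similarity, separate gap-opening and gap-extension penalties, and, for BLAST and FASTA, 
heuristic local rather than global alignment, so the values they report may differ. 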

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of 
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
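
The requirement that the two coding sequences be joined in-frame can be checked in silico 
before any construct is made. The following is a minimal sketch under stated assumptions: 
the helper names (translate, fuse_in_frame) are hypothetical, the standard genetic code is 
assumed, the input sequences are short toy sequences rather than any GST or other real 
coding sequence, and no linker, restriction site or vector context is modeled. 

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_term_cds, c_term_cds):
    """Join two coding sequences so the second is read in the frame of the first.

    The stop codon of the N-terminal partner, if present, is removed. A
    ValueError is raised if either part would shift the reading frame or if
    the fusion contains a premature stop codon.
    """
    n_part, c_part = n_term_cds.upper(), c_term_cds.upper()
    for label, part in (("N-terminal", n_part), ("C-terminal", c_part)):
        if len(part) % 3 != 0:
            raise ValueError(f"{label} part is not a whole number of codons")
    if n_part[-3:] in STOP_CODONS:
        n_part = n_part[:-3]
    fused = n_part + c_part
    if "*" in translate(fused)[:-1]:
        raise ValueError("fusion contains an internal stop codon")
    return fused


if __name__ == "__main__":
    tag = "ATGTCCCCTTAA"      # toy fusion moiety: Met-Ser-Pro-stop
    insert = "ATGAAAGGGTGA"   # toy polypeptide: Met-Lys-Gly-stop
    fused = fuse_in_frame(tag, insert)
    print(fused, translate(fused))
```

In this toy case the stop codon of the fusion moiety is dropped and the joined sequence 
translates as a single open reading frame; an N-terminal part whose length is not a 
multiple of three is rejected because it would shift the frame of the downstream part. 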

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 
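
An antisense molecule is, at the sequence level, the reverse complement of the region of the 
sense strand it is meant to bind. A minimal sketch of that relationship follows. The function 
names, the 20-nucleotide default length and the example sequence are assumptions made for 
illustration; the sketch does not address target-site accessibility, specificity or chemical 
modification, which in practice govern the choice of an antisense oligonucleotide. 

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(sense_seq, start, length=20):
    """Return a DNA oligo complementary to sense_seq[start:start + length].

    sense_seq is the sense (mRNA-like) strand written as DNA, 5' to 3'.
    The returned oligo is also written 5' to 3'.
    """
    window = sense_seq[start:start + length]
    if len(window) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(window)


if __name__ == "__main__":
    # Toy sense-strand fragment (hypothetical, not a sequence of the invention).
    print(antisense_oligo("ATGGCCAAGTTT", start=0, length=6))
```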

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either over expressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 
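
One of the polynucleotide uses listed above is as a source of information to derive PCR 
primers. As a rough illustration of that derivation, the sketch below takes a forward primer 
from the 5' end of a template and a reverse primer as the reverse complement of its 3' end, 
and estimates a melting temperature by the Wallace rule (2 degrees C per A or T, 4 degrees C 
per G or C), a rule of thumb intended for short oligonucleotides only. The function names, 
the default primer length and the template are assumptions for the example; real primer 
design also weighs secondary structure, primer-dimer formation and specificity. 

```python
def wallace_tm(primer):
    """Estimate melting temperature (degrees C) by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def end_primers(template, length=18):
    """Return (forward, reverse) primers flanking the whole template.

    The forward primer is the first `length` bases of the given strand; the
    reverse primer is the reverse complement of the last `length` bases.
    Both are written 5' to 3'.
    """
    t = template.upper()
    if len(t) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = t[:length]
    reverse = t[-length:].translate(str.maketrans("ACGT", "TGCA"))[::-1]
    return forward, reverse


if __name__ == "__main__":
    # Toy template (hypothetical, not a sequence of the invention).
    fwd, rev = end_primers("ATGGCCAAGTTTGACTGA", length=6)
    print(fwd, wallace_tm(fwd))
    print(rev, wallace_tm(rev))
```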

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli, 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6, Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11, Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific 
promoter driving a selectable marker. The selectable marker allows only cells of the desired type 
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be 
accomplished by culturing the stem cells in the presence of a differentiation factor such as 
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell 
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder 
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in 
the presence of the polypeptide of the invention alone or in combination with other growth 
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell 
proliferation is determined by colony formation on semi-solid support e.g. as described by 
Bernstein et al., Blood, 77: 2316-2321 (1991). 

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal 
biological activity in support of colony forming cells or of factor-dependent cell lines indicates 
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, 
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy 
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the 
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., 
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or 
treat consequent myelo-suppression; in supporting the growth and proliferation of 
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of 
various platelet disorders such as thrombocytopenia, and generally for use in place of or 
complementary to platelet transfusions; and/or in supporting the growth and proliferation of 
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned 
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as 
those usually treated with transplantation, including, without limitation, aplastic anemia and 
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment 
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow 
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous)) 
as normal cells or genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others, 
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., 
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells 
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic 
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et 
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, 
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, 
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of 
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture 
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. 
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the 
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed, has application 
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in 
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing 
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as 
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing 
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by 
a composition of the present invention contributes to the repair of congenital, trauma induced, or 
other tendon or ligament defects, and is also useful in cosmetic plastic surgery for 
attachment or repair of tendons or ligaments. The compositions of the present invention may 
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or 
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming 
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect 
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis, 
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include 
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art. 

The compositions of the present invention may also be useful for proliferation of neural 
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral 
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which 
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a 
composition may be used in the treatment of diseases of the peripheral nervous system, such as 
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous 
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in 
accordance with the present invention include mechanical and traumatic disorders, such as spinal 
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a 
composition of the invention. 

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the 
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue 
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book 
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol. 
71:382-84 (1978). 

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including 
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and 
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells 
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., 
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More 
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be 
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, 
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful 
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer. 

Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, 
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, 
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host 
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof, 
including antibodies) of the present invention may also be useful in the treatment of allergic 
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect 
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, 
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme, 
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal 
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma 
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune 
suppression is desired (including, for example, organ transplantation), may also be treatable 
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the 
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal 
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 
1998), skin prick test (Hoffmann et al. Allergy 54: 446-54, 1999), guinea pig skin sensitization 
test (Vohr et al. Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., 
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an 
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic 
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, 
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et 
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105 
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven 
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune 
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of up regulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected 
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T 
cells to induce a T cell mediated immune response against the transfected tumor cells. In 
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with 
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an 
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain 
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II 
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction 
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T 
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding 
an antisense construct which blocks expression of an MHC class II associated protein, such as 
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity 
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce 
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human 
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. 
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. 
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., 
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994. 

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which 
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J. 
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, 
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1 
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery 
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine 
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et 
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of 
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research 
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al., 
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., 
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991. 

4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the 
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention, 
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive 
based on the ability of inhibins to decrease fertility in female mammals and decrease 
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can 
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a 
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as 
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH 
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A 
polypeptide of the invention may also be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic 
animals such as, but not limited to, cows, sheep and pigs. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. 
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, 
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other 
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population. 
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells 
across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. 
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates 
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994. 

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or 
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Compositions may be useful in treatment of various coagulation disorders (including 
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events 
in treating wounds resulting from trauma, surgery or other causes. A composition of the 
invention may also be useful for dissolving or inhibiting formation of thromboses and for 
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of 
cardiac and central nervous system vessels (e.g., stroke)). 

Therapeutic compositions of the invention can be used in the following: 

Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Culture Collection catalogs.
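
By way of illustration only, readings from an in vitro proliferation assay of the kind mentioned above might be reduced to a single effective-dose figure as sketched below. The function name estimate_ic50, the log-linear interpolation, and the sample readings are assumptions introduced for this sketch; they are not taken from the cited references, and other curve-fitting methods may be preferred in practice.

```python
import math

def estimate_ic50(doses, viabilities):
    """Estimate the dose giving 50% of control proliferation.

    doses: increasing concentrations (any consistent unit, all > 0).
    viabilities: proliferation as a fraction of the untreated control.
    Interpolates linearly on log10(dose) between the two adjacent points
    that bracket 0.5; returns None if 0.5 is never crossed.
    """
    points = list(zip(doses, viabilities))
    for (d_lo, v_lo), (d_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_ic50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ic50
    return None

# Hypothetical readings from a cultured tumor cell proliferation assay.
doses = [0.1, 1.0, 10.0, 100.0]
viabilities = [0.95, 0.80, 0.20, 0.05]
ic50 = estimate_ic50(doses, viabilities)
print(f"Estimated IC50: {ic50:.2f} (same unit as doses)")
```

An estimate of this kind is only a starting point; an effective dose in an animal model or patient would still have to be determined experimentally.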

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses. Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine or rhodamine, or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
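
As a non-limiting sketch, the diminution in complex formation measured in a competitive binding assay might be expressed as a percent inhibition and used to rank test compounds. The function names, the 50% threshold, and the signal values below are hypothetical and are introduced only for illustration; the signals may be any consistent readout (e.g., counts from a labeled ligand).

```python
def percent_inhibition(signal_with_compound, signal_total, signal_nonspecific):
    """Percent reduction in specific binding caused by a test compound.

    signal_total: binding signal with no competitor present.
    signal_nonspecific: background signal (non-specific binding).
    """
    specific_total = signal_total - signal_nonspecific
    if specific_total <= 0:
        raise ValueError("total signal must exceed the non-specific signal")
    specific_with = signal_with_compound - signal_nonspecific
    return 100.0 * (1.0 - specific_with / specific_total)

def rank_hits(readings, signal_total, signal_nonspecific, threshold=50.0):
    """Return (compound, percent inhibition) pairs at or above threshold,
    strongest inhibitor first."""
    scored = [
        (name, percent_inhibition(signal, signal_total, signal_nonspecific))
        for name, signal in readings.items()
    ]
    hits = [item for item in scored if item[1] >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical plate readings for three test compounds.
readings = {"A": 550.0, "B": 910.0, "C": 190.0}
for name, pct in rank_hits(readings, signal_total=1000.0, signal_nonspecific=100.0):
    print(f"compound {name}: {pct:.0f}% inhibition")
```

The threshold is arbitrary here; a screening campaign would normally set it from the variability of control wells.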

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in tissue culture or in vivo animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.

Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
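
For illustration, the comparison of two otherwise identical cell populations described above might be tabulated as follows. The function name, the two-fold cutoff, and the response values are assumptions made for this sketch; the response may be any measured readout (e.g., a reporter signal) in consistent units.

```python
def receptor_dependent_ligands(responses_with, responses_without, min_fold=2.0):
    """Return ligands whose response in receptor-expressing cells is at
    least min_fold times the response in cells lacking the receptor.

    Both arguments map ligand name -> measured response. Ligands with a
    missing or non-positive baseline are skipped rather than guessed at.
    """
    hits = {}
    for ligand, with_receptor in responses_with.items():
        baseline = responses_without.get(ligand)
        if baseline is None or baseline <= 0:
            continue
        fold = with_receptor / baseline
        if fold >= min_fold:
            hits[ligand] = fold
    return hits

# Hypothetical responses of the two cell populations to three ligands.
with_receptor = {"ligand_1": 12.0, "ligand_2": 3.0, "ligand_3": 8.0}
without_receptor = {"ligand_1": 3.0, "ligand_2": 3.0, "ligand_3": 8.0}
print(receptor_dependent_ligands(with_receptor, without_receptor))
```

A ligand that elicits the same response in both populations (ligand_2 and ligand_3 in this example) is presumed to act independently of the receptor of the invention.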

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease, or inflammation
resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.

4.10.16 LEUKEMIAS

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or components; affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.



4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
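
As a minimal sketch of the sequencing and restriction fragment approaches described above, a sequenced fragment may be compared position by position with a reference sequence, and the effect of any substitution on a restriction enzyme recognition site may be checked. The function names and the short sequences are hypothetical; GAATTC is the EcoRI recognition site. The sketch assumes the two sequences are already aligned and of equal length and does not handle insertions or deletions.

```python
def find_substitutions(reference, sample):
    """Return (1-based position, reference base, sample base) for each
    single-base difference between two aligned, equal-length sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]

def site_positions(sequence, site):
    """Return 1-based start positions of a recognition site in a sequence."""
    seq, site = sequence.upper(), site.upper()
    return [i + 1 for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]

def rflp_change(reference, sample, site):
    """Report recognition sites lost or gained in the sample relative to
    the reference, which is what alters the restriction fragment pattern."""
    ref_sites = set(site_positions(reference, site))
    sample_sites = set(site_positions(sample, site))
    return {"lost": sorted(ref_sites - sample_sites),
            "gained": sorted(sample_sites - ref_sites)}

# Hypothetical reference and patient-derived fragments.
reference = "ATCGGAATTCGTA"
sample = "ATCGGAGTTCGTA"
print(find_substitutions(reference, sample))
print(rflp_change(reference, sample, "GAATTC"))
```

In this example the single substitution at position 7 destroys the recognition site beginning at position 5, so digestion would yield a different fragment pattern for the two alleles.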

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.



4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium in CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
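
By way of example only, arthritis scores collected on the scoring days listed above might be summarized per group and compared as sketched below. The function names and all score values are hypothetical placeholders, not experimental results, and the scoring scale of the cited protocol is not reproduced here.

```python
from statistics import mean

# Scoring days after adjuvant injection, as listed in the procedure above.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]

def mean_scores_by_day(scores_by_animal):
    """Average arthritis score per scoring day for one treatment group.

    scores_by_animal: one list per animal, holding one score per entry
    of SCORING_DAYS, in the same order.
    """
    for scores in scores_by_animal:
        if len(scores) != len(SCORING_DAYS):
            raise ValueError("each animal needs one score per scoring day")
    return {day: mean(animal[i] for animal in scores_by_animal)
            for i, day in enumerate(SCORING_DAYS)}

def score_reduction(control_group, treated_group):
    """Mean control score minus mean treated score for each scoring day."""
    control = mean_scores_by_day(control_group)
    treated = mean_scores_by_day(treated_group)
    return {day: control[day] - treated[day] for day in SCORING_DAYS}

# Placeholder scores for two animals per group (illustrative values only).
control_group = [[4, 5, 8, 10, 12, 12], [6, 7, 8, 10, 12, 14]]
treated_group = [[2, 3, 4, 4, 5, 5], [4, 3, 4, 6, 5, 7]]
print(score_reduction(control_group, treated_group))
```

A positive difference on a given day indicates a lower mean score in the treated group; whether such a difference is meaningful would depend on group size and an appropriate statistical test.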

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 ng/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
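
Purely as an arithmetic illustration, a weight-based dose of the kind stated above may be converted to a total amount per administration as follows. The function name, the unit table, and the 70 kg body weight are assumptions for this sketch, and the preferred range is read here as 0.1 µg/kg to 10 mg/kg; the sketch is not dosing guidance, which remains with the prescribing physician.

```python
# Conversion factors from each supported mass unit to milligrams.
UNIT_TO_MG = {"ng": 1e-6, "ug": 1e-3, "mg": 1.0}

def total_dose_mg(dose_per_kg, unit, body_weight_kg):
    """Convert a per-kilogram dose to a total dose in milligrams.

    dose_per_kg: amount per kg of body weight, expressed in `unit`.
    unit: "ng", "ug" (micrograms) or "mg".
    body_weight_kg: body weight of the individual in kilograms.
    """
    if unit not in UNIT_TO_MG:
        raise ValueError(f"unsupported unit: {unit}")
    if dose_per_kg < 0 or body_weight_kg <= 0:
        raise ValueError("dose must be non-negative and weight positive")
    return dose_per_kg * UNIT_TO_MG[unit] * body_weight_kg

# Ends of the preferred range for a hypothetical 70 kg individual.
low = total_dose_mg(0.1, "ug", 70.0)
high = total_dose_mg(10.0, "mg", 70.0)
print(f"{low:.3f} mg to {high:.0f} mg per dose")
```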



4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or antithrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or antithrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
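
The arithmetic behind the VPD:5W co-solvent system described above can be sketched as follows. This is an illustrative calculation only: it assumes that "diluted 1:1" means equal volumes and that volumes are additive on mixing, which is an approximation for ethanol/water mixtures. The function name `dilute` and the dictionary names are hypothetical and do not come from the specification.

```python
# Component concentrations of the VPD stock in % w/v, as stated in the text.
VPD_STOCK = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
# Aqueous diluent: 5% dextrose in water (% w/v).
DILUENT = {"dextrose": 5.0}


def dilute(stock, diluent, stock_parts=1.0, diluent_parts=1.0):
    """Return the % w/v of each component after mixing by volume.

    Assumption: volumes are additive, so each concentration scales with
    the volume fraction contributed by its source solution.
    """
    total = stock_parts + diluent_parts
    mixed = {name: pct * stock_parts / total for name, pct in stock.items()}
    for name, pct in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + pct * diluent_parts / total
    return mixed


vpd_5w = dilute(VPD_STOCK, DILUENT)
for name, pct in vpd_5w.items():
    print(f"{name}: {pct:.2f}% w/v")
```

Under these assumptions each VPD component is halved in the diluted system, and the dextrose concentration is likewise half that of the diluent. Other dilution ratios can be explored by changing `stock_parts` and `diluent_parts`.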

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin-like growth factor I), to the final composition
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
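
The therapeutic index defined above, the ratio between LD50 and ED50, can be illustrated with a short calculation. The compound names and dose values below are invented purely for illustration and are not data from this specification; the function name `therapeutic_index` is likewise hypothetical.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index LD50 / ED50.

    Both doses must be positive and expressed in the same units
    (for example mg/kg), so that the ratio is dimensionless.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
compounds = {
    "compound A": (500.0, 5.0),
    "compound B": (200.0, 20.0),
}

# Rank candidates so that the highest therapeutic index comes first,
# reflecting the stated preference for high therapeutic indices.
ranked = sorted(
    compounds,
    key=lambda name: therapeutic_index(*compounds[name]),
    reverse=True,
)
for name in ranked:
    print(name, therapeutic_index(*compounds[name]))
```

With these made-up values, compound A has the larger index and would be ranked first.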

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
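
As a rough sketch of how the fraction of time above the MEC might be estimated, the following assumes a model that the specification does not itself prescribe: a single intravenous bolus, one-compartment distribution, and first-order elimination, so that the plasma concentration is C(t) = C0 * exp(-k * t) with k = ln(2) / half-life. Accumulation from earlier doses is ignored. All function names and numerical values are hypothetical.

```python
import math


def time_above_mec_hours(c0, mec, half_life_h):
    """Hours for which C(t) = c0 * exp(-k * t) stays above the MEC.

    Assumption: single bolus, one-compartment, first-order elimination.
    """
    if c0 <= 0 or mec <= 0 or half_life_h <= 0:
        raise ValueError("all inputs must be positive")
    if c0 <= mec:
        return 0.0  # the peak never exceeds the MEC
    k = math.log(2) / half_life_h  # elimination rate constant, 1/h
    return math.log(c0 / mec) / k


def fraction_of_interval_above_mec(c0, mec, half_life_h, interval_h):
    """Fraction (0 to 1) of one dosing interval spent above the MEC."""
    if interval_h <= 0:
        raise ValueError("interval must be positive")
    hours = time_above_mec_hours(c0, mec, half_life_h)
    return min(hours / interval_h, 1.0)


# Illustrative values only: peak 8 units, MEC 1 unit, half-life 4 h,
# dosing every 24 h. Three half-lives elapse before C(t) reaches the MEC.
frac = fraction_of_interval_above_mec(8.0, 1.0, 4.0, 24.0)
print(f"fraction of interval above MEC: {frac:.2f}")
print("within the 10-90% window:", 0.10 <= frac <= 0.90)
```

In practice the concentrations would come from HPLC assays or bioassays, and a more complete pharmacokinetic model would likely be needed for repeated dosing or non-intravenous routes.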

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
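
The per-kilogram ranges above translate into absolute daily amounts by simple multiplication with body weight. The sketch below only illustrates that unit arithmetic; it is not dosing guidance, the 70 kg body weight is an arbitrary example, and the constant and function names are hypothetical.

```python
UG_PER_MG = 1000.0  # micrograms per milligram

# Daily ranges from the text, expressed in micrograms per kg body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)
PREFERRED_RANGE_UG_PER_KG = (0.1, 25 * UG_PER_MG)


def daily_dose_bounds_ug(weight_kg, per_kg_range):
    """Return (lower, upper) daily dose bounds in micrograms."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = per_kg_range
    return low * weight_kg, high * weight_kg


# Example: a 70 kg subject and the preferred range.
low_ug, high_ug = daily_dose_bounds_ug(70.0, PREFERRED_RANGE_UG_PER_KG)
print(f"lower bound: about {low_ug:.1f} µg/day")
print(f"upper bound: about {high_ug / UG_PER_MG:.0f} mg/day")
```

The span of several orders of magnitude between the bounds reflects the breadth of the stated range; the actual amount remains a matter for the prescribing physician, as the following paragraph notes.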

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and 
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will 
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely 
to encode surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte 
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g., 
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. 
Mol. Biol. 157:105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
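
By way of illustration only, a sliding-window hydropathy profile of the kind referred to above can be sketched as follows. The per-residue values are the published Kyte and Doolittle hydropathy indices; the function names, the 7-residue window, and the example sequence are assumptions chosen for this sketch, and no Fourier transformation is applied.

```python
# Kyte and Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every window along the sequence.

    Lower (more negative) values indicate more hydrophilic stretches,
    which are more likely to be exposed on the protein surface.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    seq = sequence.upper()
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


if __name__ == "__main__":
    # Hypothetical sequence, not taken from the sequence listing.
    print(most_hydrophilic_window("MKTLLVAGDEKRKDESSGLLIV", window=7))
```

The window length is a tunable assumption; a window of at least 6 residues matches the minimum antigenic peptide length stated above, and candidate peptides identified this way would still need experimental confirmation.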

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually 
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
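
As a hedged illustration of how a binding affinity might be estimated from equilibrium binding data, the following sketch fits the classical linear Scatchard relation, bound/free = Ka x (Bmax - bound), by ordinary least squares. This is a simplified textbook form and is not presented as the procedure of the cited reference; the function name, the single-site assumption, and the synthetic data are all assumptions made for illustration.

```python
def scatchard_fit(bound, free):
    """Estimate (Ka, Bmax) from paired bound and free ligand concentrations.

    Fits the straight line  bound/free = Ka * (Bmax - bound)
    by ordinary least squares, assuming a single class of binding sites.
    The slope of the line is -Ka and its x-intercept is Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matching bound/free measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope                 # association constant
    bmax = intercept / ka       # total binding-site concentration
    return ka, bmax


if __name__ == "__main__":
    # Synthetic single-site data generated with Ka = 2.0 and Bmax = 10.0
    # (arbitrary units), used only to show that the fit recovers them.
    true_ka, true_bmax = 2.0, 10.0
    free = [0.1, 0.5, 1.0, 2.0, 5.0]
    bound = [true_bmax * true_ka * f / (1 + true_ka * f) for f in free]
    print(scatchard_fit(bound, free))
```

Real binding data are noisy and may reflect more than one class of sites, in which case a linear fit of this kind can be misleading and nonlinear curve fitting is generally preferable.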

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture 
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for 
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen- 
binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker, 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 
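
The figure of ten possible antibody molecules follows from simple combinatorics, which the sketch below enumerates. It assumes, for illustration, that each antibody consists of two heavy/light arms, that either light chain can pair with either heavy chain, and that the cognate pairs are H1 with L1 and H2 with L2; the chain labels and function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def quadroma_species(heavy=("H1", "H2"), light=("L1", "L2")):
    """Enumerate the distinct antibody molecules a quadroma could assemble.

    Each arm is one heavy chain paired with one light chain (4 possible arms).
    An antibody is an unordered pair of arms, so repeats are allowed and
    order does not matter.
    """
    arms = [h + l for h, l in product(heavy, light)]
    return list(combinations_with_replacement(arms, 2))


def is_correct_bispecific(species):
    """True only for the molecule with one cognate arm of each specificity."""
    return set(species) == {"H1L1", "H2L2"}


if __name__ == "__main__":
    all_species = quadroma_species()
    correct = [s for s in all_species if is_correct_bispecific(s)]
    print(len(all_species), "possible molecules;", len(correct), "correctly bispecific")
    # prints: 10 possible molecules; 1 correctly bispecific
```

The enumeration counts distinct molecular species only; it makes no claim about their relative abundance, which depends on how the chains actually associate in the cell.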

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5): 1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Holliger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an 
alternative mechanism for making bispecific antibody fragments. The fragments comprise a 
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker 
which is too short to allow pairing between the two domains on the same chain. Accordingly, 
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH 
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular 
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also 
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest 
binds the protein antigen described herein and further binds tissue factor (TF). 

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and methyl-4- 
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and antibody- 
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992) 
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti- 
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

WO 01/53312 PCT/US00/34263 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and 
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
212Bi, 131I, 131In, 90Y, and 186Re. 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl), 
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido 
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as 
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), 
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a 
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). 
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid 
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See 
WO94/11026. 

In another embodiment, the antibody can be conjugated to a "receptor" (such as 
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is 
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable media can 
be used to create a manufacture comprising computer readable medium having recorded thereon 
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for 
storing information on computer readable medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on computer readable medium to generate 
manufactures comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring 
formats (e.g. text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
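
As an illustration only, the two storage formats mentioned above (a plain ASCII text file and a database application) can be sketched in Python. The sketch uses the SQLite engine from the Python standard library as a stand-in for the database products named in the text; the table name, column names, function names and the demonstration record are all hypothetical and are not taken from the sequence listing.

```python
import sqlite3


def store_sequences(records, db_path=":memory:"):
    """Record (identifier, residues) pairs in a small relational table.

    SQLite stands in here for any database application; the schema is an
    illustrative assumption.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sequences VALUES (?, ?)", records)
    conn.commit()
    return conn


def to_fasta(records, width=60):
    """Render the same records as ASCII text in FASTA layout."""
    lines = []
    for seq_id, residues in records:
        lines.append(">" + seq_id)
        # Wrap the residues so that each text line holds at most `width` characters.
        lines.extend(residues[i:i + width] for i in range(0, len(residues), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [("demo_1", "ACGTACGTACGT")]  # placeholder, not a real SEQ ID NO
    conn = store_sequences(demo)
    print(conn.execute("SELECT seq_id, length(residues) FROM sequences").fetchall())
    print(to_fasta(demo, width=8), end="")
```

Either representation leaves the sequence retrievable by a program, which is the only property the text relies on; the choice between them would follow from how the stored information is to be accessed.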

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a 
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the 
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 in computer readable form, a skilled 
artisan can routinely access the sequence information for a variety of purposes. Computer 
software is publicly available which allows a skilled artisan to access sequence information 
provided in a computer readable medium. The examples which follow demonstrate how 
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and 
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system 
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may 
be protein encoding fragments and may be useful in producing commercially important proteins 
such as enzymes used in fermentation reactions and in the production of commercially useful 
metabolites. 
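
As a rough illustration of what identifying ORFs in a stored nucleic acid sequence can involve, the following Python sketch scans all six reading frames for stretches that run from an ATG codon to the next in-frame stop codon. It is not the BLAST or BLAZE software named above; the function names, the `min_codons` parameter and the demonstration sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) tuples.

    Coordinates are 0-based on the strand that was scanned, `end` is
    exclusive, and the stop codon is included in the reported ORF.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGCC"):
        print(orf)
```

A production ORF caller would also deal with ambiguous bases, alternative start codons and ORFs that run off the end of a fragment; those cases are deliberately left out of this sketch.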

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. As stated above, the 
computer-based systems of the present invention comprise a data storage means having stored 
therein a nucleotide sequence of the present invention and the necessary hardware means and 
software means for supporting and implementing a search means. As used herein, "data storage 
means" refers to memory which can store nucleotide sequence information of the present 
invention, or a memory access means which can access manufactures having recorded thereon 
the nucleotide sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented 
on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of a known sequence which match a particular target sequence or target 
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially 
available software for conducting search means is available and can be used in the computer-based 
systems of the present invention. Examples of such software include, but are not limited to, 
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A 
skilled artisan can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches can be adapted for use in the present 
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino 
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can 
readily recognize that the longer a target sequence is, the less likely a target sequence will be 
present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide 
residues. However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
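
To make the notion of a search means concrete, the sketch below scores a target sequence against each stored sequence with a basic Smith-Waterman local alignment and reports the stored sequences that reach a threshold. It is a minimal sketch under stated assumptions: the scoring values (match 2, mismatch -1, linear gap -2), the function names and the two demonstration entries are hypothetical and do not come from any of the software packages named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def search(target, database, min_score):
    """Return (name, score) pairs for stored sequences scoring >= min_score."""
    hits = [(name, smith_waterman_score(target, seq)) for name, seq in database.items()]
    return sorted((h for h in hits if h[1] >= min_score), key=lambda h: -h[1])


if __name__ == "__main__":
    stored = {"s1": "CCGATTACACC", "s2": "TTTTTTTT"}  # placeholder entries
    print(search("GATTACA", stored, min_score=10))
```

The remark about target length can be read off the same sketch: a longer target needs a higher score to match in full, and such a score is less likely to arise by chance among unrelated stored sequences.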

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan 
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
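
The role of sequence information in antisense design can be sketched as follows: each candidate oligonucleotide is simply the reverse complement of a window of the target mRNA, with the window length held to the 20 to 40 base range stated above. This is a minimal sketch; the function name, the `step` parameter and the demonstration mRNA are hypothetical, and real design work would also weigh accessibility, melting temperature and off-target matches, none of which is modeled here.

```python
# RNA base -> complementary DNA base
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligos(mrna, length=20, step=1):
    """Yield (start, oligo) pairs of DNA oligos complementary to mRNA windows.

    `start` is the 0-based position of the window on the mRNA; the oligo is
    written 5' to 3'.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    mrna = mrna.upper().replace("T", "U")
    for start in range(0, len(mrna) - length + 1, step):
        window = mrna[start:start + length]
        yield start, window.translate(RNA_TO_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    demo_mrna = "AUGGCCAUUGUAAUGGGCCGCUG"  # placeholder, not from the sequence listing
    for start, oligo in antisense_oligos(demo_mrna, length=20):
        print(start, oligo)
```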



4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid 
probes or antibodies of the present invention. Examples of such assays can be found in Chard, 
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, 
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry, 
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice 
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, 
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the 
present invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the tissues, cells or 
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane 
extracts of cells are well known in the art and can readily be adapted in order to obtain a 
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which 
comprises: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents and reagents capable of detecting the presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to 
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

The novel polypeptides and binding partners of the invention are useful in medical 
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the 
invention is involved in the immune response, for imaging sites of inflammation or infection). 
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of 
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID 
NO:1-1786 and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic 
acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid. 

In general, therefore, such methods for identifying compounds that bind to a 
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of 
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified. 

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene 
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via such 
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hruby et al., "Application of Synthetic Peptides: Antisense 
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and 
Kasprzak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only 
expressed in a limited number of tissues, a hybridization probe derived from any of the 
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the 
presence of RNA of a cell type of such a tissue in a sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The 
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
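
A degenerate pool of possible sequences can be pictured as the set of discrete oligonucleotides obtained by expanding each ambiguous position of a degenerate oligonucleotide. The sketch below does this with the standard IUPAC nucleotide ambiguity codes; the function name and the example oligonucleotide are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """List every discrete sequence represented by a degenerate oligo."""
    choices = (IUPAC[base] for base in oligo.upper())
    return ["".join(p) for p in product(*choices)]


if __name__ == "__main__":
    pool = expand_degenerate("ATGRAYTGG")  # placeholder oligo
    print(len(pool), pool)
```

The pool size is the product of the number of choices at each position, so it grows quickly with the number of ambiguous positions.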

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA 
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of 
chromosome spreads has been described, among other places, in Verma et al. (1988) Human 
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation 
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of 
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to 
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be 
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72); 
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell 
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all 
references being specifically incorporated herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6, 
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on 
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo. 
Of course, this same linking chemistry is applicable to coating any surface with streptavidin. 
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc 
Laboratories have developed a method by which DNA can be covalently bound to the microwell 
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino 
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be 
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the 
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA 
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42). 

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has 
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed 
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using 
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the 
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently 
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to 
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end 
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and 
then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and 
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole, 
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ss DNA solution is 
then dispensed into CovaLink NH strips (75 ul/well) standing on ice. 

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in 
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at 
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are 
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed 
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C). 
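
The dilution arithmetic implied by the linkage procedure can be checked with a short calculation. The sketch below is illustrative only and rests on stated assumptions: that the 7.5 ng/ul figure refers to the DNA before the 1-methylimidazole stock is added, and that a 900 ul batch is being prepared (the batch volume is hypothetical and not given in the text). The function names are likewise hypothetical.

```python
def stock_volume_for_final(sample_volume, stock_conc, final_conc):
    """Volume of stock to add to `sample_volume` so the mixture is at `final_conc`.

    Derived from stock_conc * v = final_conc * (sample_volume + v).
    """
    return sample_volume * final_conc / (stock_conc - final_conc)


def diluted_conc(conc, volume, added_volume):
    """Concentration after `added_volume` of diluent is added to `volume`."""
    return conc * volume / (volume + added_volume)


if __name__ == "__main__":
    dna_volume_ul = 900.0  # assumed batch size
    # 0.1 M (100 mM) 1-methylimidazole stock brought to 10 mM final.
    stock_ul = stock_volume_for_final(dna_volume_ul, stock_conc=100.0, final_conc=10.0)
    dna_ng_per_ul = diluted_conc(7.5, dna_volume_ul, stock_ul)
    dna_ng_per_well = dna_ng_per_ul * 75  # 75 ul dispensed per well
    # 25 ul of 0.2 M EDC added to the 75 ul already in the well.
    edc_final_molar = diluted_conc(0.2, 25, 75)
    print(stock_ul, dna_ng_per_ul, dna_ng_per_well, edc_final_molar)
```

Under these assumptions one part stock is added to nine parts DNA solution, and the EDC ends up at one quarter of its stock concentration in each well.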

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by 
reference. This method of preparing an oligonucleotide bound to a support involves attaching a 
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic 
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported 
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard 
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include 
nucleoside phosphoramidite and nucleoside hydrogen phosphonate. 

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example, 
addressable laser-activated photodeprotection may be 
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by 
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also 
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res. 
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 
169(1) 104-8; all references being specifically incorporated herein. 

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991), 
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of 
oligonucleotides with cyanuric chloride. 

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated 
herein by reference. These authors used current photolithographic techniques to generate arrays of 
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct 
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile 
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile 
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be 
generated in this manner. 
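
For orientation, a matrix of 256 spatially defined probes can be laid out as a 16 x 16 grid. The sketch below fills such a grid with every 4-base sequence, because 4^4 = 256; this is an illustrative assumption about one combinatorial set that happens to have 256 members, not a statement of what the cited authors synthesized. The function name and parameters are hypothetical.

```python
from itertools import product


def probe_grid(length=4, alphabet="ACGT"):
    """Return a square grid (list of rows) holding every probe of `length`."""
    probes = ["".join(p) for p in product(alphabet, repeat=length)]
    side = int(round(len(probes) ** 0.5))
    if side * side != len(probes):
        raise ValueError("probe count is not a perfect square")
    # Row r holds probes r*side .. (r+1)*side - 1, so each probe has a fixed address.
    return [probes[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    grid = probe_grid()
    print(len(grid), len(grid[0]), grid[0][:4], grid[15][15])
```

Because each probe occupies a fixed row and column, a hybridization signal at a given address identifies the probe sequence at that address.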

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes 
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or 
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples 
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be 
prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic 
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are 
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever 
device allows controlled application of low to intermediate pressures to the cell. The results of these 
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA 
fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
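
As a rough illustration of the two specificities described above, the following Python sketch lists the blunt-end cut positions in a sequence for the normal (PuGCPy) and the relaxed CviJI** (PuGCPy, PyGCPy and PuGCPu) recognition patterns. The function names `cut_sites` and `fragments` are hypothetical, and the sketch models a complete digest only; it ignores reaction kinetics and partial digestion.

```python
import re

# Pu = purine (A or G); Py = pyrimidine (C or T).
# Normal CviJI: PuGCPy. Relaxed CviJI**: PuGCPy, PyGCPy and PuGCPu
# (every NGCN site except PyGCPu), as described in the text above.
# Lookaheads are used so that overlapping sites are all reported.
NORMAL = re.compile(r"(?=[AG]GC[CT])")
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")


def cut_sites(seq, relaxed=False):
    """Return 0-based indices of the C in each site; the cut falls between G and C."""
    pattern = RELAXED if relaxed else NORMAL
    # A match starts at the first base of the 4-mer; the C is two bases later.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split seq into the blunt-ended fragments of a complete digest."""
    cuts = cut_sites(seq, relaxed)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

Calling `cut_sites(seq, relaxed=True)` on the same input as `cut_sites(seq)` shows how many additional sites become available under the relaxed conditions.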

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

422 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
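
The 64-patient example can be sketched numerically. The Python snippet below assumes a 9 mm pitch between the pins of the 96-pin device (an assumption; the pitch is not stated above) and an 8 x 8 arrangement of the 64 dots within each subarray at 1 mm spacing, which leaves a 1 mm gap between neighbouring subarrays. The names `dot_position`, `all_dots`, `PIN_PITCH_MM` and `DOT_PITCH_MM` are illustrative only.

```python
PIN_ROWS, PIN_COLS = 8, 12      # 96-pin device matching a 96-well plate
PIN_PITCH_MM = 9.0              # assumed pin spacing (not given in the text)
DOT_PITCH_MM = 1.0              # one dot per 1 mm step within a subarray
SUB_SIDE = 8                    # 8 x 8 = 64 dots (patients) per subarray


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot printed by one pin for one patient plate."""
    if not 0 <= patient < SUB_SIDE * SUB_SIDE:
        raise ValueError("patient index must be in 0..63")
    # Offset printing: each plate is printed with a small shift of the whole pin head.
    off_row, off_col = divmod(patient, SUB_SIDE)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def all_dots():
    """Map every (patient, pin_row, pin_col) to its membrane coordinate."""
    return {
        (p, r, c): dot_position(p, r, c)
        for p in range(SUB_SIDE * SUB_SIDE)
        for r in range(PIN_ROWS)
        for c in range(PIN_COLS)
    }
```

Under these assumptions the 96 subarrays of 64 dots each give 6144 distinct dot positions, and the outermost dot centres lie within the 8 x 12 cm membrane.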

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
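
The clustering step can be pictured with a toy model. In the Python sketch below, a clone's signature is simply the set of probes that score positive, and clones are grouped greedily when the Jaccard similarity of their signatures meets a threshold. The similarity measure, the threshold of 0.9 and the names `signature`, `jaccard` and `cluster_clones` are assumptions made for illustration; they are not the SBH signature analysis actually used.

```python
def signature(clone_seq, probes):
    """Set of probes (e.g., 7-mers) found in the clone; a stand-in for hybridization."""
    seq = clone_seq.upper()
    return frozenset(p for p in probes if p.upper() in seq)


def jaccard(a, b):
    """Jaccard similarity of two probe sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_clones(signatures, threshold=0.9):
    """Greedy clustering: join the first cluster whose representative is similar enough."""
    clusters = []  # each cluster is a list of clone names; the first is the representative
    for name, sig in signatures.items():
        for members in clusters:
            if jaccard(sig, signatures[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([name])
    return clusters
```

The first member of each cluster can then serve as the representative clone selected for sequencing.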

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
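
The seed-and-extend procedure can be outlined as a loop that stops when no new sequence qualifies. The Python sketch below is a minimal outline under stated assumptions: `find_hits` is a hypothetical callable standing in for a BLASTN search of the current assemblage against the databases, returning (sequence id, score, percent identity) tuples, and the loop is written iteratively rather than recursively. The thresholds are the ones given above; how the assemblage consensus is rebuilt after each round is not described here and is left out.

```python
MIN_SCORE = 300          # BLAST score must be greater than 300
MIN_IDENTITY = 95.0      # percent identity must be greater than 95%


def extend_assemblage(seed_id, find_hits):
    """Grow an assemblage from a seed EST until no new member passes the filter.

    find_hits(members) is a hypothetical search function returning an iterable
    of (seq_id, score, percent_identity) hits against the current assemblage.
    """
    members = {seed_id}
    while True:
        new = {
            seq_id
            for seq_id, score, identity in find_hits(frozenset(members))
            if score > MIN_SCORE
            and identity > MIN_IDENTITY
            and seq_id not in members
        }
        if not new:          # nothing extends the assemblage: terminate
            return members
        members |= new
```

Passing the search in as a function keeps the sketch independent of any particular BLAST installation or database.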

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327.
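
Candidate frame shifts and premature stop codons are typically spotted by translating the sequence and looking for in-frame stops. The Python sketch below flags internal stop codons in each of the three forward reading frames. It uses the standard stop codons TAA, TAG and TGA, and the function name `internal_stops` is hypothetical. It is offered only as an aid to the kind of hand editing described above, not as the FASTY or BLAST comparison itself.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stops(cdna):
    """Return {frame: [codon start indices]} for stop codons that are not the last codon."""
    seq = cdna.upper()
    result = {}
    for frame in range(3):
        # Start index of every complete codon in this reading frame.
        starts = list(range(frame, len(seq) - 2, 3))
        # Ignore a stop in the final complete codon, where a real stop is expected.
        result[frame] = [i for i in starts[:-1] if seq[i:i + 3] in STOP_CODONS]
    return result
```

A frame that is expected to be open but reports internal stops is a candidate for a sequencing error or frame shift upstream of the first reported position.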

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences encoded by the nucleic acid sequences are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
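
For orientation, the two summary values can be computed from a per-residue S-score profile as sketched below in Python. The sketch assumes that the profile and the predicted cleavage position are already available from the prediction program, and it takes the mean S score over the residues preceding the cleavage site, which is an assumption about how the mean is defined. The function name `s_score_summary` and the example numbers are hypothetical.

```python
def s_score_summary(s_scores, cleavage_pos):
    """Return (max_S, mean_S) for a per-residue S-score profile.

    s_scores: list of S scores, one per residue, starting at the N-terminus.
    cleavage_pos: 1-based position of the first residue of the mature protein,
                  so the signal peptide is residues 1..cleavage_pos-1.
    """
    if not 1 < cleavage_pos <= len(s_scores) + 1:
        raise ValueError("cleavage_pos must leave a non-empty signal peptide")
    signal_part = s_scores[:cleavage_pos - 1]
    max_s = max(s_scores)                          # highest S score in the profile
    mean_s = sum(signal_part) / len(signal_part)   # average over the signal peptide
    return max_s, mean_s
```

For instance, a six-residue toy profile with a cleavage position of 5 averages the first four scores for the mean S score.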

5.3.2 EXAMPLE 4 

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences encoded by the nucleic acid sequences are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.3.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences encoded by the nucleic acid sequences are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.

The homology for SEQ ID NO: 1653-1745 was obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.5.2 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homology for SEQ ID NO: 1746-1768 was obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielson et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences encoded by the nucleic acid
sequences are shown in the Sequence Listing. The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786.

The homology for SEQ ID NO: 1769-1786 was obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielson et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS. 

TABLE 1

Tissue Origin    RNA Source    Hyseq Library Name    SEQ ID NOS:

adult brain      GIBCO         AB3001

9 19-21 50-51 65-56 72 78 80 82
85 87 107-108 113 116 123 138
140 150-152 159 169 177 192-193
202-203 212-214 225-226 235-236
251 258 268-269 272 290-281 295
298 301 321 326 331-332 334 356-
357 362 369 379 382-383 416 423
443 459-460 473 475 477 488 496
500 503 519 526 547 574 582 587
608-609 613 618 633-634 645-645
652 657-658 660 669-671 678 687
695 697 710 715 724 731 775-777
796 804 811 857-859 862 869 899-
900 912 919 922 924-929 933 936
962 979 988-909 996 1001 1004-
1008 1018 1039 1047 1059 1064
1067 1070 1078 1082 1107 1113
1116-1117 1131 1134-1137 1140
1149 1151 1157 1180 1206 1229
1234 1241 1243 1253 1272-1273
1279 1288-1290 1294 1307-1308
1312 1320 1323 1330 1356 1360-
1361 1368 1373-1375 1379 1391
1400 1417 1445 1468 1482 1493-
1494 1501-1503 1506-1507 1512
1517 1522-1524 1530-1533 1537
1549 1565 1578 1598 1606 1608
1623 1625 1627 1639 1643 1648-
1649 1653 1664 1667 1671 1696
1734 1741 1743-1744 1760-1761
1771



GIBCO 



ABD003



3 12-14 18-19 25 30-31 34-36 43-
45 50-51 56 58 60 65-66 68-69 80 
82 85 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-137
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192-
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
288 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-381 
393 401 408 414 419 424 426-428 
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 534 536- 
540 542-543 545 553 555 560 569- 
570 574-576 586-588 593 595 597
601 606-609 616-620 622-623 625
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 750 753 759 765-766
773 775-778 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 332-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089 



adult brain Clontech 



1097 1103 1107 1109 1112 1116-
1117 1119 1121 1124 1127 1130
1134 1144 1145 1149 1151 1157-
1158 1167 1170 1173 1184 1188
1190 1193-1194 1200 1202 1215-
1217 1220 1226 1227 1229 1231
1241 1243 1247 1252 1258 1263
1267 1269 1279 1281 1284 1286-
1289 1293-1294 1306-1307 1312
1316-1320 1326 1333 1338 1341
1344 1348 1351 1355-1357 1368
1374 1377 1380 1386 1389-1390
1394 1400 1409 1414 1422-1423
1425-1427 1437 1443 1446 1454
1456 1458-1459 1468 1470-1472
1478 1482-1483 1487-1488 1493
1497 1499 1506 1508-1511 1517
1522-1524 1530-1533 1545-1546
1548-1550 1552 1557-1559 1563
1565 1567 1569 1571 1586 1588
1591 1593 1595 1598-1601 1603
1611 1620-1621 1624-1626 1628
1630-1632 1636 1640-1641 1644-
1645 1647 1649 1653-1655 1657
1664 1667 1669 1673 1678 1681
1686 1690 1594-1696 1701 1709
1711 1719 1722-1723 1726-1727
1731-1733 1738 1740 1743-1744
1747 1749 1753 1757 1758 1760-
1761 1765 1771 1785



ABR001 



adult brain Clontech 



29 63-59 113 115 146 152 206 
223 245 277 307 320 324 330-331 
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490
506 517 586 597 631 641 659 691 
715 799 003 833 865 871 875 880 
882 908 920 937 1000 1005-1006 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293 
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507 
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



ABR006



adult brain Clontech 



5-8 15-16 168 212-213 271 278 

280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1262 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1655 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1751 



ABR008



10 13-19 22-23 25 29 33 37-39 " 
43-45 50-51 54-55 57-53 60-66 
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 



208 210 214-215 218 221-226 229
231-232 234-241 245-247 251-253
255 257-259 268-269 271 276-281
285-286 288 290-292 300-302 304
307 309-311 313 315 317-318 320
322 325-326 328 330-331 333-338
341 344-347 349 352 354 356-357
362 369-373 376 379-380 382 384
387 390-391 393-394 397 399-403
405-411 414-415 417-420 426-428
437-438 440-444 453-455 462 464
467 469-471 476 478 492-484 488-
491 497 503 506-513 516-517 520
524-526 528-530 532-534 537-540
542 544 547-551 553 561 565-567
572-574 577 581 585 587-588 590-
591 597 599 601-602 606-610 612
615-617 619-620 622-623 628-629
631 633-634 636-641 643 645-647
651-653 655-664 669-671 673 679
632 687 689 691-700 702 706 710
715-717 720-721 725-734 736-739
742-743 746 750-752 756 758-759
762-764 766 768 773-778 780-782
734-785 787-789 794 796 799 802-
803 805 811 614-815 818 625-826
834-837 839-840 842-843 856-859
861-862 865 867-872 874-875 881
883-884 887 889-892 894-895 897-
898 901 904 908 910 912 914 917
919 921-924 926-927 930-932 935-
941 943 945 949 953-954 958 961-
963 967 969 971 975 977 981-983
986 988-990 992 997 999-1002
1004-1006 1008 1012 1018-1023
1027 1029-1031 1035-1037 1047-
1048 1053 1057 1059 1063 1068
1070 1072-1075 1077 1081-1083
1085-1093 1095-1096 1108-1112
1114-1125 1127 1131-1133 1135-
1138 1142-1145 1148-1158 1160-
1163 1167 1169 1172 1175 1177
1180 1183-1188 1191-1195 1199-
1200 1204 1206 1211 1213-1216
1222-1223 1226-1227 1229-1231
1234-1235 1241-1242 1244-1263
1266 1269-1271 1276-1277 1279-
1281 1284-1286 1292 1294-1295
1299 1305-1309 1312 1314 1316-
1319 1322 1324-1327 1330 1332
1334-1335 1339 1344-1346 1351
1354-1355 1357-1358 1365-1367
1369-1370 1373-1374 1376-1379
1381-1384 1386-1388 1392 1394
1396-1397 1400 1403-1407 1410
1414 1419-1420 1423 1432-1433
1435 1437-1438 1440-1442 1445
1448 1453-1455 1457 1461 1463-
1464 1466 1468 1471 1477 1480
1482-1463 1496 1502-1504 1507-
1509 1513 1519-1520 1524-1526
1536 1547 1549-1552 1567 1573-
1574 1578 1586-1589 1597-1598
1601-1602 1605 1607-1609 1611-
1617 1619-1621 1623 1625-1626
1635-1641 1643-1645 1649 1651
1653 1656-1658 1664 1659 1671-
1674 1676-1684 1686 1639-1690
1694-1696 1704-1705 1708-1709



adult brain 



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771-
1772 1776-1777 1779-1780 1786



Clontech 



ABR011 



adult brain



24 75 103 186 210 310-312 364-
365 508 623 710 937 1002-1003
1059 1204 1509 1731-1732



BioChain 



ABR012 



adult brain 



46 132-164 204-205 300 739 767 
1371 1549 1620 1684



adult brain 



Invitrogen 
Invitrogen 



ABR013 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



ABR014 



adult brain



Invitrogen 



ABR015 



adult brain
adult brain 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738

419 434-435 441-442 763 789 983
1320 



Invitrogen 



ABR016 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



Invitrogen 



ABT004 



14-16 22-23 25 37-39 43 58 60

70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184 
194 196 198 210 218 229 259 267
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-509 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 778 780-781 789 803
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960- 
961 963 968-969 988-989 1000
1005-1006 1016-1019 1021 1036- 
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
1170 1174 1188 1193-1194 1205 
1225 1229 1231 1254 1258 1262 
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547
1554 1557-1559 1561-1562 1567
1585 1588 1590 1595 1601-1604
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768 
1779 1785-1786 



cultured 
preadipocytes 



Stratagene



ADP001 



5-8 11 17 25 68-69 80 82 87 103
105 110 116 136-138 168 171 188-
189 196-198 261 267 276 288 293
301 318 331 336-338 379-380 391
400 428 430-431 510-512 520 524
527 549 557 561 602 618 620 622
631 637 647 670 681-682 710 731
748 782 793-794 817 834-836 843
845 858-859 879 882 893-895 934
960 982 986 995-995 1000 1002
1005-1007 1025 1027-1028 1032
1039 1045 1071 1078 1097 1099-
1102 1136-1137 1140 1219-1220



adrenal gland 



1260 1271 1297-1298 1314 1320
1322 1329 1339 1345 1365-1366
1370-1371 1398 1403 1423 1431
1437 1466 1468 1533 1539 1594
1602 1608 1614 1631 1649-1650
1660 1662 1673 1687-1688 1695
1711 1719-1720 1742 1746 1749
1760-1761 1765 1767 1771 1785



Clontech 



ADR002 



adult heart 



4-10 15-16 25 29-31 43-45 47 50-
51 55 60 62-63 65-66 75 80 102
116 118 122 126 130 137 150 169-
170 181 192 198 201-203 215 227-
228 247 251 255 267-269 271 280-
281 285 295 298 311 336-338 342
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573 
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175
1181 1188 1209 1218 1224-1225 
1227 1231 1243 1270-1271 1274 
1280 1295 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345 
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 1491 1507 1512 
1538 1546 1567 1573-1575 1588 
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
1765



GIBCO



AHR001 



4-8 10-11 15-16 18-21 34-39 44-
46 50-52 57-58 60 62-63 71 75 82
85 87 89 94 97 100 103-104 108-
110 112 114 116 118-119 122-123
127 130-132 134 136-138 141-144
147-151 153 163 164 168-171 179
186 192 195 197 199 204-205 212-
215 220 225-226 229-230 232 234-
236 251 257-260 262 265 272 274
277 280-282 285-286 289-292 296
298-301 304 307 309 314 321 324-
325 330 333 336 338 345 349 351-
352 354 358 361 368 370 380 383-
384 387-388 391 393 397 401 406
408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455
457 459 462 469 472-473 476-480
483-484 487-490 492-493 496-498
503 506 508 510-513 516 519-522
526 534 536-540 542 546 549 553
560-562 574-577 581-582 584 586-
587 589 593 595 597 604-609 611-
612 615-620 622-623 626 632 637
645-652 656-660 665-666 670-672
674-675 683-684 687 692-694 697
701 709 712 715 716 719-720 725-



adult kidney 



GIBCO 



AKD001 



726 728 730-732 735 738-739 743- 
744 746 751 753 759 761 765 770-
771 775-780 785 788-790 796 802
804 810 812 817 821 826 828 830
837 843 845-847 849-853 857-861
863-864 869 871 875 877-879 881
883 887 890-892 894-895 897-898
901 903 906-907 911-913 915 919
921-925 927-928 933-935 945 958 
961-963 967 969-972 975 977-978 
980-986 990 992 999-1002 1005- 
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040 
1043 1047 1050 1054-1055 1057 
1059 1063-1064 1067-1068 1070 
1072 1075-1076 1083 1085-1087 
1089 1093-1094 1104 1106 1108- 
1109 1113 1116-1117 1119 1121 
1124 1126 1128 1131-1134 1144- 
1145 1148-1149 1151 1158 1167 
1169-1170 1175 1177 1192 1196 
1199-1200 1202 1206-1208 1211 
1216 1218 1222 1227-1229 1232- 
1235 1238-1241 1243-1244 1247- 
1248 1250 1253-1254 1256-1258 
1261 1268 1270-1271 1277 1280- 
1282 1287 1292 1298-1299 1306 
1308 1317-1321 1324-1325 1330 
1332 1334-1337 1339 1344-1345 
1349-1350 1354-1356 1359-1360 
1365-1366 1369 1371 1374-1375 
1378-1380 1383-1384 1389 1397 
1400 1403 1409 1417 1423-1426 
1437 1439 1442 1444 1446-1447 
1450 1453 1468 1470 1473 1479 
1481 1488 1490 1501-1504 1519 
1521 1524 1528 1530-1534 1536- 
1537 1539 1541-1542 1547 1553 
1555 1560 1565 1567-1571 1588 
1591 1597-1598 1601-1602 1605 
1614-1616 1619-1620 1623-1628 
1630-1632 1634 1636 1641 1644- 
1645 1647 1649 1652-1655 1659
1662 1667 1673-1674 1680-1681 
1684 1686-1688 1704-1705 1709 
1711-1712 1717 1724 1726-1727 
1731-1733 1737-1738 1741 1743- 
1744 1749 1754-1755 1760-1761 
1765 1772 1785 



4-8 10-11 17-21 29-31 35 1 39 42-
45 50-51 56-58 60-61 64 68-69 75 
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123 
127-133 136-137 139-141 143-144 
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199 
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271-
272 274 276-277 279-281 284-286
290 293 295 298-299 301-302 304
307 311-313 321 325-326 329-331 
333 341 344 348-350 352 356 358- 
359 362 364-365 368 370-372 374 
376-377 380-382 392 395 398 400- 
401 404 407-409 414-415 423-424 
430-437 443-444 446 449 451 453- 
455 459 461-462 464 467 469 471- 
474 476-477 480-481 483 487-488 






WO 01/53312 PCT/US00/34263 























490-491 493 497-505 [illegible]
520 522 524 526-529
544 547 549 554-556 560 562 564
567 571-576 578 582 586-589 592-
593 598-599 601 604-606 [illegible]
615-619 621-626 632-634 637-643
645-652 655 660-664 669-672 676
678-679 688 692-695 698 702 711
713 717 719-720 727 [illegible] 735-736
738 743 745-746 751 [illegible] 755 762-
763 765 771-773 775 770 [illegible]
788 793 795-796 800 [illegible]
810-812 814-819 821 [illegible]
834-838 842-845 848 [illegible]
854-865 867 869 871 [illegible]
886-887 889-891 893 [illegible]
902 906-908 910-914 918
925-927 929-935 937 940- [illegible]
948-949 951 953-958 960- [illegible]
964 969-970 972 976-978 [illegible]
990 992-993 995-997 999-1002
[illegible]-1008 1010 1012-1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 1047
1050 1054-1055 1057-1064 1068
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-1102
1107 1109-1112 1116-1119 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149-1150 1153-
1154 1157 1159 1163 1167 1170
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257-1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 1293-1295
1299 1306 1308 1311-1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-[illegible]
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1403 1405
1407-1409 1417 1419 1423
1428-1431 1433 1437-1438 [illegible]
1443 1445-1446 1448-1450 1453-
1454 1459 1461 1465-1468 1474-
1475 1478 1484-1488 1490 [illegible]
1493 1495 1497-1498 1506
1509 1512 1518 1521-1522 1525
1527-1528 1532-1533 1537 1540-
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 1575
1578-1579 1583 1586-1587 1589
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624-1628 1631-
1632 1634-1636 1638-1639 1641
1644 1646-1649 1653-1656 1662
1664 1666-1667 1670-1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723-1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1778 1780 1785




adult kidney 


Invitrogen


AKT002


20-21 37-39 47 52 57 60 65-66
68-69 80 104 107-108 122 130 133
136-137 140 142-143 149 169 174
adult lung



GIBCO



181 197 227-228 235-236 244 [illegible]
261-265 267 280-281 286 290 299
301 304-305 309 312-313 339 341
344-345 349 358 370-372 376 382
383 387 392 401 414 416 421 430
443 445 449 453-454 472 487-488
504 506 513 516 519 522 528 536
540 546 554 585 587 594 598 602
607 616-617 626-627 636 643 662
664 695 709 721 735 743 761 768
775-777 788 796 804 814 827 837
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919
925 927 934 941 949 952 957 960
962 968 970 1000 1008 1029-1030
1044 1052 1055 1063 1067-1068
1073 1085 1099-1102 1107 1110-
1111 1113 1115 1119 1126 1134
1136-1137 1146-1148 1153 1159
1192 1196 1199 1232-1233 1241
1256 1254 1272-1273 1281 1285
1293-1294 1299 1312 1320 1324-
1325 1330 1344 1349 1351 1355-
1356 1365 1378-1379 1403 1414
1419 1428-1429 1436 1446 1458
1463-1464 1467-1468 1470 1477-
1478 1486 1491 1509 1519 1527
1529 1534 1547 1596 1600 1619
1623 1629 1631 1634 1638 1643
1647 1652 1660 1664 1667 1669
1670 1673 1686 1709 1727 1740
1776



ALG001 



4-8 14 37-39 44-46 50-51 56 62-
63 75 82 88 93 103-104 113 125
133 140 143 150 152 154 157 162
171-172 174-175 190 191 196 200
211 214 219 223-224 227-228 251-
252 256 265 272 274 280-281 285
310 332 345 351 362 371 381-382
394 408-409 431 436 445 454 459
461 467 469 471 476-477 488 504
513 527 537-540 544 547-548 554
564 583 607 616-617 621 623-624
634 645-646 662-664 670 695 716
719 743-744 763 766 774 789 803
811 814 817 831-832 837-838 845
852-853 858-859 861 866 880 887
901 905 941 954-957 966 971 977
979 981 987 990 992 996 1001
1005-1006 1014 1017 1045 1047
1054 1059 1062 1064 1072 1080
1086-1089 1094 1107 1126 1134
1136-1137 1142 1150 1157 1173
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320
1331-1332 1353 1374 1379 1383-
1384 1404 1409 1423 1434 1436
1442 1474 1478 1494 1509 1522
1525 1531-1532 1547 1549 1553-
1554 1571 1598 1606 1613 1624
1627-1629 1632 1642 1644 1662
1669 1676-1677 1684 1696 1727
1731-1732 1737-1738 1748-1749
1786



lymph node 



Clontech



ALN001 



24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421 
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547 



621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857-
858 866 879 905 913 928 963 976
1005-1006 1012 1038 1050 1116-
1117 1151 1199 1204 1226 1243
1265 1274 1324-1325 1339 1353
1374 1377 1440-1441 1447 1504
1549 1600 1618-1619 1631 1641
1644 1653 1687-1688 1691-1692
1741 1771



ALV001



ALV002 



5-8 11 20-21 46 50-51 58 65-66
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587 
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667
1669 1684 1691-1692 1702 1711
1725 1738 1741 1743-1744 1758
1760-1761 1763-1765 1769

5-8 17 20-21 32-33 41 55 58 64
75 77 86 89 102 108 117 119 175-
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904 
911 918 921-922 925 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 



adult liver 
adult ovary 



1550 1567 1578 1581 1583 1594
1597 1601-1602 1611-1612 1615
1618-1619 1621 1625 1637 1645
1647 1652 1654-1655 1660 1666
1669-1671 1684 1706 1722 1737-
1738 1742-1744 1760-1761 1763-
1765 1772 1774



Clontech 
Invitrogen 



ALV003 



29 676 997 1063 1119 1536 1766 



AOV001 



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 165 168- 
170 174 177-178 180 182-186 188-
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601-
605 607-613 615 618-622 624-627
630 632-633 636-640 642 644-647
649-652 654-655 657-665 667-675
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1105-1108
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195 
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238-
1241 1243-1244 1247 1249 1252-
1254 1256-1258 1262 1265 1267-
1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298- 



1299 1306 1308 1312 1317-1321 1323
1327 1329-1330 1332-1333 1338-1339
1341 1343-1351 1356 1359 1361
1365-1366 1371-1375 1377-1379
1383-1384 1386 1389 1394 1400 1404
1416-1417 1422-1427 1429-1431
1435-1436 1439-1443 1445-1450
1453-1454 1459 1463-1464 1466 1468
1470 1474-1481 1484-1485 1488 1491
1493-1494 1496-1498 1501-1504
1506-1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 1534-1536
1538-1539 1541 1545 1548-1550 1553
1555-1559 1561-1563 1566-1567
1569-1570 1572 1574-1575 1578
1580-1581 1587-1588 1590-1591 1595
1597-1598 1600-1606 1609 1611-1621
1623-1630 1634 1636 1638 1641 1643
1645 1647-1657 1659-1662 1664 1667
1669-1671 1673-1674 1676-1681
1683-1690 1699 1702-1707 1710-1711
1713-1714 1715-1719 1723-1724
1726-1728 1731-1733 1735 1737-1738
1740-1741 1743-1744 1748-1751 1753
1755-1756 1760-1762 1765 1767-1768
1770-1771 1776 1778-1779 1783-1784
1786





adult placenta 



Clontech 



APL001



5-8 44-45 90-91 107-108 159 178 
311 351 414 475 503 545 574 624 
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



placenta 



Invitrogen 



APL002 



14-16 26 29 43 60-61 79-80 103
106 116 135 171 177 180 194 196
198 210 216 235-236 272 290 299
309 329 334 339 359 379-380 417
423 430 434-435 448 454 483 490-
491 517 522 631 723 725-726 728
738 746 769 818 843 854-855 857-
858 916 948 953-954 976 988-989
1005-1006 1013 1033 1036 1064
1068 1070 1086 1139 1144-1145
1160 1277 1285 1317 1320 1343
1345 1429 1435 1438 1454 1482
1486 1490 1512 1519 1532 1549
1592-1593 1602 1626 1647 1649
1664 1673 1675 1722 1727 1730
1746 1776



adult spleen



GIBCO



ASP001



3 5-8 12 15 16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 385-
387 394 406 414 431 434-436 446
448 451 473 481 490-493 500 503
505 517 519 530 534 536 540 547
554 557 574 576 582 592 595 604
611-612 620 621 623 631-632 642
652 659 661 667 671 673-675 684
700 721 728 730 732 738 742-744
746 762 765 774 780 788-789 794
810-811 817 822 830 832 845 848
852-853 858 862 866 874 879 882
884 906-908 912 919 921-923 926-
927 934 942 949 957-958 963 977-
978 983 990 992-994 996-997 999
1005-1007 1010 1012 1031 1036
1042-1044 1046 1049 1059 1068
1070 1076 1089-1090 1094 1103
1109 1112 1115 1124 1140 1163
1170 1174 1177 1190 1196 1219-
1220 1226-1227 1229 1236 1241
1246 1258 1269 1271 1274 1295
1301 1320 1322 1330 1334-1335
1339 1349 1351 1353 1359-1360
1364 1369 1374 1386 1397 1413
1417 1434 1436-1437 1439 1468
1474 1477 1480 1485-1487 1498
1512 1522 1525 1544 1549 1553
1560 1567 1591 1600 1631 1635
1651 1654-1655 1658 1662 1670
1674 1678-1679 1684 1686 1700
1727 1733 1738 1740 1741 1760-
1761 1774 1779 1781 1782



testis 



GIBCO 



ATS001



5-8 10 26 30-31 47 50-51 57 68-
69 82 84-85 97 102 113 119 137
139 150 152 154 156 163 169 174
176-177 192 194 196 197 212-215
227-228 247 255 258 261 282 285
288-289 301 307 311 316 330 334
349 370-372 392 398 410 415 426-
427 430-431 433 437 446 454 461
469 473 477 481-482 493 499 502-
503 513 522 526 547 552-553 563-
564 572-573 575-576 581-582 585
599-602 605 612 615 617 620 631
637 647 649-650 656 660 665 670
674-675 712 719-721 723 728 731
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837
843 845 848 859 866 869 877 905
913 916 919 921 926 929 937 950
960 963 971 975 977 981 990 992-
993 1007 1016 1029-1030 1034-
1035 1038-1039 1045 1059-1060
1064 1070 1072-1073 1087 1089
1097 1099-1102 1104 1108 1113
1141 1149 1161-1162 1175 1208-
1209 1222 1227 1229 1231 1235
1238-1239 1243 1253 1285 1287-
1289 1291-1293 1307 1311 1317-
1320 1330 1332 1338 1345 1369
1373-1374 1379 1389 1399-1400
1409 1423-1424 1430 1435-1437
1443 1459 1484 1486 1490 1493
1496-1497 1501 1505 1509-1513
1527 1530-1531 1533 1537 1546
1549 1563 1565 1567 1569 1571
1577 1586 1591 1599 1602 1625
1628 1630-1632 1636 1639 1642
1649 1661-1662 1666-1667 1670
1675 1684 1690 1699 1705 1712
1717 1724 1730 1737-1738 1752
1767 1779



686 1352 1412



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC
Library) 



BAC001 



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



1411-1412 



Genomic DNA 
from BAC 33316 



adult bladder 



Research 
Genetics 
(CITB BAC 
Library) 



BAC003 



TTsT 



Invitrogen 



BLD001 



bone 



Clontech 



5-8 17-18 22-23 33 37-39 56-57 
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 353 382 
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 789 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167
1185 1189 1199 1270 1369 1481
1536 1560 1573 1596 1614 1636-
1637 1649-1650 1654-1655 1658 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



BMD001



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210-
213 215 217 219 222 224-226 233
235-237 242-244 255 258 260 263-
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567
569-577 581 583-586 588 593 601
603-604 608-609 613-619 621-622
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912-
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266
1270 1278 1281 1285 1287 1290-
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 
Clontech



BMD002



1506 1509 [illegible] 1521 [illegible] 1524
1526 1528 1531 1536-1537 1543
1546 1548-1549 1552 1554-1555
1557-1559 1571-1572 1581 1589-
1592 1597-1600 1609 1614 1621
1626-1628 1630-1632 1634 1636
1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723
1727 1737 1738 1740 1758 1767
1772 1781 1782 1785-1786



bone marrow 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137 
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566 
569-570 581 583 590-591 597-598
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834- 
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573-
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



bone marrow 



Clontech 



BMD004 



bone marrow 
adult colon 



73-74 503 922 1036 1711 



Clontech 



BMD007 



95-96 866 1320 1475 



Invitrogen



CLN001



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 401 485 503 510-
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 



Mixture of 16 
tissues - 
mRNAs 
Mixture of 16 



1462-1464 1512 1556 1583 1587
1594 1596 1614 1625-1626 1631
1639 1645 1650 1675-1677 1687-
1688 1701 1713-1714 1724 1740
1765



Various 
Vendors 



CTL016 



401 1490 1686 



tissues - 
mRNAs' 
adult cervix 



Various 
Vendors 



CTL021



312 782 1132-1133 1403 1712 1715



BioChain 



CVX001 



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229-
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831-
832 834-836 843 847-848 851-855
857-860 864-866 869 871 876 878-
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053-
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
Tissue Origin 



diaphragm 



endothelial 
cells 



RNA Source 



BioChain 



Hyseq 
Library Name 



DIA002 



SEQ ID NOS: 



1406 1416 1425-1427 1431 1436- 
1437 1442 1446 1448 1453 1459 
1466 1472 1478 1482 1496 1501- 
1503 1506 1512 1522 1527-1528 
1531 1533 1541 1547 1569 1571 
1585 1589 1597-1598 1600 1608- 
1609 1614-1616 1620 1623 1624 
1626-1628 1630 1638 1641 1643 
1649 1653 1656 1662 1667 1669 
1674-1675 1683 1685-1688 1699 
1702 1709-1710 1715 1717 1722 
1724 1729 1731-1732 1735-1739 
1741 1743-1744 1748-1749 1755 
1760-1762 1767 1773 1778 1785- 
1786 



137 282 289 730 780 986 1409 
1478 1599 1614 



Stratagene 



EDT001 



3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-58 
60-61 65-66 68-69 73-74 77-78 80 
82-83 85 87 89 93-96 101-105 108 
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 152-153 
161-163 166-172 176-179 187 190 
192 194 196-201 204-207 210 212- 
214 220 224 229-230 233 235-236 
240-241 251-252 258 261-262 265 
267-269 272 276-277 279-281 284- 
285 288 290 295-296 301-302 310- 
311 313 316 321 325 329 331-333 
335 340-342 351-355 360 371 375 
380-382 384 387 390 392 397 400 
407-408 410 412 414 416 425-427 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
519 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 
572-576 579 581 585-586 589 593 
595 597 599 603 607-612 615-617 
620 622 626 630 632-634 638-641 
644 647 656-660 662-664 670 673 
678 680-682 692-697 707 709-710 
712-713 719 730 732 734 736 738 
743-746 751 759 768 771 773 775- 
778 783 786-789 793 800 803 805- 
807 810-811 814 816-818 821-822 
824 826 828-829 832 834-838 842- 
845 848-850 854-860 862 864 869 
871 874 876-879 883 885 887 890- 
891 894-895 898-900 903 908 910- 
913 916 919-922 924 926-928 930- 
935 939 943 948-949 951-954 957 
959-961 964 969-970 973 975-978 
983-984 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 1050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103 
1107 1109-1113 1116-1117 1124- 
1126 1128-1131 1134-1135 1138 
1140 1144-1145 1148-1149 1153 
1157 1160 1163 1171 1183-1184 
1198-1199 1202 1205-1207 1211 
1216-1217 1219 1221 1225 1229 
1232-1235 1238-1241 1243-1244 
1246 1250 1253 1257-1258 1261 
Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1265-1266 1268 1270-1271 1274-1277 1280-1283 1285-1286 
1288-1290 1293 1295 1298 1308 1312 1317-1320 1324-1325 
1327 1329-1330 1334-1335 1338 1342-1343 1345-1347 1350 
1355-1356 1359 1367 1369 1374 1376 1379 1398 1400 1406 
1408 1414 1417 1419 1424-1426 1428-1431 1434-1438 
1440-1442 1448 1450 1462-1466 1468 1472 1474 1478 
1487-1488 1491-1493 1501-1504 1506 1509 1511 1516 
1520-1521 1526 1529 1531 1536-1537 1539-1540 1546-1547 
1549 1552 1555 1557-1559 1561-1SSS 1568 1571 1575 
1578-1579 1581-1583 1587-1588 1590 1592 1597 1605-1606 
1611 1613 1615 1618-1621 1624-1628 1630-1631 1634 1636 
1638 1641 1643-1650 1652-1659 1664 1666-1667 1669 1671 
1675-1681 1683-1688 1696-1698 1703 1711 1715-1716 1719 
1722-1723 1726 1731-1733 1736 1739-1741 1743-1744 1749 
1755 1760-1761 1765 1767-1768 1771-1773 1776 1779 
1783-1786 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



286 686 1297 1303-1304 1352 
1411-1412 1754 



131-132 261 289 380 503 860 892 

1000 1007 1397 

62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



esophagus 
fetal brain 



BioChain 
Clontech 



ESO002 



FBR001 



Clontech 



fetal brain 



FBR004 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



fetal brain 



Clontech 



FBR006 



5-9 25 43 60 62-63 65-66 70 72 
80 87 92 101 103 108 114 136 139 
149 152-153 157 168 171-172 175 
207-208 210 212-213 221-226 237- 
238 251-253 266 272 279-281 295 
301-302 307 310 317-318 321-324 
330 333-334 336-338 346-347 352 
357 370 373 377 379-380 382 384 
391-392 397 399 402 406-408 410- 
411 417 421 424 426-427 430 436- 
437 440-443 454 460 464 467 473 
476 483 488-489 495 497 508 510- 
513 516 519-520 524 530 537-540 
544 547 550 561 567 572-574 582 
590-591 595 597 604 607-609 615 
623 628-629 631 634 638-640 655 
657-658 660 665 669 674-675 679 
689 691-694 696-697 699 701 706 
710 716 720 728 732 734 736 742- 
744 757-760 763 775-778 780 799 
806-807 810 817-818 826 839 843 
858 861 864 871-872 884 890-891 
894-895 898 904 915 921-923 935- 
936 938 945 950 952 955-956 958- 
959 961 963 967 969-971 990 992 
Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



999 1001 1005-1006 1008 1013 
1016 1022 1024 1029-1030 1032 
1035 1042 1047-1048 1052 1056 
1065 1067 1070 1082 1089 1109 
1114-1115 1119 1131 1143-1149 
1151 1153-1156 1160 1163 1167 
1172-1173 1178 1184 1186 1188 
1190-1200 1211 1216 1222-1223 
1226-1227 1229 1231 1236 1245 
1253-1255 1258 1260 1262 1266 
1270-1273 1281 1287 1308-1309 
1314 1317-1320 1326 1334-1335 
1339 1341 1344 1350 1356 1369- 
1371 1373 1376-1379 1381-1382 
1386 1392 1396-1398 1419 1423 
1425-1426 1428-1429 1432 1437 
1440-1441 1448 1466 1470 1482 
1502-1503 1507 1511 1513 1516 
1519 1536 1544 1549-1550 1557- 
1559 1573 1589-1590 1598 1608 
1611-1614 1619 1621 1625-1626 
1640 1651 1657 1658 1676-1679 
1693 1696 1703 1704 1713-1714 
1718 1720 1722 1724 1726 1728 
1730-1733 1735 1736 1738-1739 
1742 1745 1755 1759-1761 1765 
1767 1771-1772 1777 1779-1780 
1786 



fetal brain 



Clontech 



FBRS03 



235-23* $20 864 10*8 11*8 1587 



fetal brain 



Invitrogen 



FBT002 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 419-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120-1128 
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal heart 



Invitrogen 



FHR001 



105 124 150 289 864 1036 1148 
1229 1614 1616 1762 1785 



fetal kidney 



Clontech 



FKD001 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 
Tissue Origin 



fetal kidney 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



258 277 280-281 307 310 314 320 
371 387 392 395 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 780 
798 821 833 844 854-855 857 864 
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1785 



Clontech 



FKD002 



fetal kidney 



352 384 424-427 440 583 602 1060 
1131 1324-1325 1636 



Invitrogen 



FKD007 



fetal lung 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



Clontech 



FLG001 



fetal lung 



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



Invitrogen 



FLG003 



fetal lung 



Clontech 



9 15-16 29 41 47 68-69 83 88-89 
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 425-427 430 432 467- 
468 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728 
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027 
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413- 
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1557 1590 
1601 1636 1644 1653-1655 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



FLG004 



fetal liver - 
spleen 



Columbia 
University 



FLS001 



103 276 334 
1614 1658 
3 



465-466 737 843 1131 



11 13 15 21 25 30-39 41-48 50- 
51 54 56-58 60-66 68-69 72 75 
77-80 82-83 85 87 89 92-103 105- 
110 112 116-124 126-127 130 133 
135-139 141 144 147-149 152-153 
157 163-165 167-172 174 176-178 
180 186 188-190 193-194 196 198- 
200 202-206 210-214 219 221-231 
233-236 240-244 246-247 250-251 
255-256 258 261-265 268-269 272 
274 276-278 280-281 284-286 288 
293 295 299-301 304 306-307 309 
311 314 316 318 320-321 326 329- 
332 342 344-345 350 352-353 356- 
358 360 362 370-374 376 378-384 
386-387 390 392-393 400-401 403 
406 408 410-412 415 417 419 422- 
437 439-442 444-445 448 452-454 
456 459 461-470 472-479 481-483 
487-488 490-491 493 500-501 503- 
506 509-513 515-520 522-524 526- 
529 531 534 536-540 542 547-549 
553-554 561-562 564 567-568 571- 
576 579 581 583 585-597 599-605 
Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 




















607 610-613 615-621 623-624 626 628-634 636-640 644 
647-650 655-660 665 669-670 672 674-675 678 681-682 
684 690-695 697 702 708-710 713-714 716-719 725-728 
730-731 734 736 738 740-741 743-746 748 750-751 
759-766 768 772 774-777 779 783-788 793 796 798 
800-805 808 810-812 814 818-819 821-824 826-832 
834-837 843-847 849-867 869-876 878-883 887 889-895 
897-898 902 904-914 916 919 921-928 930-937 939 
945-950 953-958 960-961 963-965 967 969 971 974-978 
980-983 986 988-990 992-993 995-997 1000-1002 
1004-1008 1012 1014 1016-1019 1025-1026 1028-1031 
1033 1035-1036 1039-1044 1047 1049-1050 1053-1056 
1058-1059 1061-1064 1067-1070 1072-1074 1076 1078 
1082 1085-1087 1089-1090 1097 1099-1103 1107-1113 
1115-1119 1121-1123 1125 1127-1128 1131-1134 
1136-1137 1144-1150 1153 1159-1160 1163 1170 1175 
1177-1178 1188 1190-1192 1195-1200 1202 1206 
1208-1211 1214 1216 1218 1221-1222 1225 1227 1234 
1237 1241 1244 1246-1247 1251 1254 1258 1261 1266 
1268 1270-1273 1277-1282 1284-1285 1287-1290 1294 
1299-1300 1306-1308 1313-1320 1324-1325 1327 1330 
1332-1333 1338 1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 1369-1370 1372-1374 
1376 1378-1381 1383-1384 1386 1389-1391 1400 
1402-1403 1405-1410 1413 1415 1417-1419 1422-1429 
1431 1435-1437 1439-1442 1445-1446 1448-1449 1454 
1458-1459 1466-1470 1472 1474 1477-1478 1480 1482 
1485 1491-1493 1496-1498 1501-1507 1509 1511-1512 
1516-1519 1524-1526 1529 1532 1536-1541 1546-1547 
1549-1550 1552-1554 1562 1564 1569 1572 1574-1575 
1578 1581 1583 1587-1588 1591-1592 1594-1595 
1597-1598 1600-1604 1611-1612 1614-1615 1617-1618 
1620-1622 1624-1625 1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667-1669 1671 1673-1674 
1676-1688 1690 1696 1701-1703 1706-1709 1711 
1713-1714 1718-1719 1722 1724-1727 1731-1733 1738 
1740-1741 1743-1744 1746 1748 1751-1752 1754 
1760-1765 1767-1773 1780 1783-1786 








fetal liver- 
spleen 



Columbia 
University 



FLS002 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 






WO 01/53312 



PCT/US00/34263 



Tissue Origin 



Hyseq 
Library Name 



SEQ ID NOS: 



RNA Source 



206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440-443 445 448 451-452 
454-455 460-463 465 467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 503- 
505 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637 642 644 647 
649-652 654-659 662 665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874 
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039 1042 1044- 
1045 1049 1053 1055 1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125 1127 1132- 
1134 1140 1143-1145 1148-1150 
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327 1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490 1493 1496 
1498 1500-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536 1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 
Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1597-1598 1600-1601 1611-1612 1618-1628 1630-1631 
1635-1638 1641 1646-1649 1652 1654-1659 1661-1662 
1664 1667-1669 1674 1676-1679 1683-1684 1686-1688 
1691-1692 1699 1702 1707 1711 1713-1714 1717 1719 
1722 1726-1727 1730-1733 1738 1740 1743-1744 
1748-1752 1753 1760-1761 1763-1764 1767 1769 
1772-1773 1776 1779 1783-1786 







fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



fetal liver 



Invitrogen 



FLV001 



15-16 26 24 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 165 176 180 194-196 198 200 
204-205 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434-435 
441-442 465-466 490 494 504-506 
509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872- 
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118- 
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 
fetal liver 



Clontech 



FLV002 
FLV004 



676 998 1719 



Clontech 



93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324 1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal muscle 



Invitrogen 



FMS001 



26 37-39 50-51 58 84 86 89 98 
113 128 131-132 139 155 172 186 
194 198 201 206 211 230-231 256 
261 276 282 286 302 325 359 361 
375 379 383 398 412-413 415 430 
436 448 452 462-463 473 477 503 
519 529 561 569-570 590-591 597 
607 623 626 635 647 660 672 715 
725-726 730 733 761 775-777 788 
826 837 860 874 913 915 921 935 
970 980 986 988-990 992 1000- 
1001 1007 1014 1027 1035-1036 
1045 1060 1064 1070 1083 1097 
Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1099 1102 1116 1121 1164 
1173 1198 1208 1228 1240 1258 
1266 1270 1277 1298 1317-1320 
1324-1325 1329 1336-1337 1369 
1383-1384 1399-1400 1403 1409 
1433 1505 1514 1542 1551 1554 
1557-1559 1562 1589 1599 1620 
1632 1644 1650 1652 1671 1675 
1712 1725-1726 1743-1744 1754 
1766 



fetal muscle 



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin 



Invitrogen 



1 4-11 15-16 20-23 25 29 33 40 

43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
913 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588 
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 



FSK001 
Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 


















1626 1632 1634 1636 1641 1643-1644 1646 1654-1657 
1660-1662 1665 1668 1675 1685 1687-1689 1702-1703 
1709-1710 1716 1719 1724 1727 1731-1732 1737-1740 
1742 1747 1749 1755 1760-1761 1765 1772 1776-1777 
1779 1780 1785 








fetal skin 


Invitrogen 


FSK002 


13 285 302 307 313 321 330 335 339 341 354 370 372 
385 400 402 408 414 426-427 433 436 450 454 515 544 
585 598 767 810 845 939 1076 1109 1155 1317-1320 
1326 1333-1335 1343 1347 1350 1369-1371 1377-1378 
1391 1397 1422 1466 1647 1656 1678-1679 1687-1688 
1693 1718 1721 1725 1731-1732 1739 1755 








fetal spleen 


BioChain 


FSP001 


110 137 211 353 589 927 1108 1639 1771 








umbilical cord 


BioChain 


FUC001 


4-8 10 12 14 17 33-36 44-46 57 64 68-69 75 82 85 101 
104 113-114 116 119 122-124 133 137 153-154 157 161 
163 166-167 175 181- 186 192 197-198 200-202 212- 230 
234 246-247 251 256 263 OCT 271-272 280-281 284 295 
301 317 321 326 333-335 345 351 368 371-373 379-380 
386 390 392 394 406 408-410 412 414 416 420 424 427 
430-436 438 444-446 454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 537-540 547 555 561 
574-577 588-591 593 606 615 620-621 632 637 645-647 
650 659-660 662-664 667-668 674-675 684 687 696 698 
701 703-705 709 711 714 719-720 725-727 732 749-750 
762 765 771 775-777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858 861 864 875 879 888 
894-895 897-900 903 906-907 911-912 925 930-933 936 
940 948 953 960 966 977 984 990 992 998 1000-1001 
1005-1007 1016 1023 1025 1037 1046-1047 1059 
1061-1063 1073 1076-1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 1156 1163 1171 1197 
1204-1205 1208 1216 1218 1224 1234-1235 1243-1244 
1246 1279 1283 1286-1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 1381 1398 1400 1403 
1408 1414 1424 1427-1428 1431 1433 1440-1442 1446 
1454-1455 1479 1482 1484-1485 1489 1492-1493 
1504-1505 1513 1525 1527 1536 1538 1546 1565 1567 
1571 1573 1575-1576 1578-1579 1591 1595 1600-1601 
1608 1612 1615 1621 1624 1626 1636-1637 1647-1648 
1651 1653 1656 1658 1661-1662 1672 1675 1682 1684 
1686-1688 1690 1709-1710 1722 1727 1729 1735-1738 
1740-1741 1760-1761 1768 


fetal brain 


GIBCO 


HFB001 


4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 
Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 




















72 75 77 80 82 85 90-91 94 100-102 107 110 112-116 
118-119 122-123 126 128 134 136-140 147-148 153-155 
157 161 165 169-172 175 181 186 188-189 197-198 
204-206 208 210 215 222-223 225-226 230 235-238 
240-241 247 253 256-258 260-262 267-269 276 279-281 
284 286 289 298 300-302 307 310 318 321-323 325 
330-331 339 341 346-349 352 354 356-359 362 364-365 
371-372 377 379-380 382 384 387 390 400 408 414-416 
419 424 431 434-435 438 441-443 449 451 453-455 
457-463 470 472-473 475 477-478 482-483 486-488 
490-491 493 496 499-500 502-504 506-507 509-512 516 
519-520 522 525-526 529-530 537-540 543-544 546-547 
566-567 569-570 572-582 585 588 590-591 593 595 599 
601 604 606-609 611-612 614-620 622-624 630 632 636 
643 645-647 650-652 654 659 661 665 667-668 670-672 
676 678 681 687 689 692-694 697 699 710 714 717 721 
727 729-732 734 736 738 743-746 750-751 759 763 766 
770 772 775-777 784 789 791 796 799 802-805 810-811 
814 819-821 824 826 830 834-837 839-850 854-856 
858-860 862 864 869 871 876-877 879 883 886-887 
890-891 893-895 898-901 905 908-910 912-916 919 
922-923 925 927 930-933 935-938 948 952-960 963-964 
967 969-972 975 978-979 981 983 986-987 990 992 995 
997 999-1002 1005-1009 1011-1013 1016 1018-1019 
1023 1026 1029-1031 1033-1035 1038 1041 1047 1050 
1053 1057 1059 1064 1068 1070 1072-1073 1078-1079 
1081-1082 1086 1089 1094 1097 1103 1107-1109 
1113-1115 1121-1122 1127 1134-1135 1138 1140 1143 
1148-1151 1153 1156-1157 1159 1167 1170 1175 
1193-1194 1200 1202 1207-1209 1211 1216 1219-1220 
1226-1227 1229 1232-1234 1240-1241 1243 1246 
1249-1251 1253-1254 1258 1267-1268 1271 1276 1279 
1282 1285-1289 1293-1294 1305 1307-1308 1312 1316 
1320 1327 1338-1339 1341-1344 1346 1349 1355-1357 
1359 1365-1366 1369-1370 1373-1375 1379 1386 1389 
1394 1398 1409 1413-1414 1416-1417 1420-1421 
1425-1427 1430 1433 1437 1439 1442 1445-1452 
1454-1457 1459 1463-1464 1468 1470 1474 1477-1479 
1489 1492 1494 1497-1498 1501-1503 1507 1509 
1511-1513 1517 1520-1521 1524-1526 1531-1533 1535 
1537-1538 1547 1554 1556-1559 1564-1567 1571 1584 
1587 1589 1594 1599-1601 1611-1612 1614-1616 
1619-1620 1625-1628 1630-1631 1634 1637-1638 
1640-1643 1645 1648- 
Tissue Origin 



macrophage 



infant brain 



RNA Source 



Invitrogen 



Columbia 
University 



Hyseq 
Library Name 



SEQ ID NOS: 



1649 1651 1653-1655 1657-1658 
1664-1665 1667 1669 1673 1678- 
1679 1683-1684 1686 1693 1701 
1704-1705 1709 1713-1714 1717- 
1720 1724 1727-1728 1731-1733 
1737-1738 1743-1744 1752 1754- 
1755 1757 1760-1761 1765 1772 
1779 1785 



™ ? 0 0 1 ~ I 5-8 113 204-205 503 634 678 859 
878 933 988-989 1379 1448 1504 

IB2002 

10 12-13 15-18 22-23 25 29 34 

37-39 43 47 50-51 54-56 58 60-63 
65-66 68-69 72-74 80 82-83 86 
88-92 97 100 102-104 106-108 110 
112-113 115-116 118 123 128 130 
134-136 138-139 143 147-149 151- 
152 154-155 163 165-167 169 172 
175 181-184 186 193-196 198 201 
203-205 209-210 214-215 222 224- 
226 231-232 235-236 239 246-247 
252 257 260 263-269 272 276-277 
279-281 286 288 291-292 295 298 
300-301 304 307 310 313 321-323 
330-331 333-334 339 345-347 349 
352 356-357 362 371-372 377 379 
380 383-384 392 397 401 406 408 
411 413-414 416 418-419 422 428 
430-431 434-435 438 443 449 453 
454 461 464-466 469-470 472-473 
475-476 478 482-483 487 490 492 
494 497 503 507-508 510-513 516 
519-520 524-526 530-534 536-540 
547 550-551 561 563-564 566-567 
572-576 579 581-582 584-587 590- 
591 593 595-597 607-609 611-613 
616-617 620 622-624 627 631 637 
641 645-647 650-655 657-658 660- 
665 667-675 689 691 695 697 699 
703 707 713-715 717 721 728-731 
733-736 739 743 745 751 755 759 
763 769-770 772 778 780-781 785 
788-789 793-794 799 803 808 811 
814 825-826 830 834-836 840-843 
845 848-850 854-855 860 862 864- 
865 870 872 875-876 878 886 888 
890-891 894-896 898 903-904 916- 
917 919 922-925 927-928 930-932 
934-936 938 941 945-946 948-950 
953-954 959-962 966-969 977 979 
981 986-990 992 997 999-1000 
1004-1006 1014 1016 1018-1019 
1024-1025 1033 1036 1047 1051- 
1052 1054-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082 
1085 1089 1108-1113 1118-1120 
1123-1124 1130 1132-1138 1140 
1149 1151 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1188 
1190 1193-1194 1196-1197 1199 
1204 1208-1209 1211 1218-1222 
1226-1227 1229 1231 1234 1241 
1247 1249 1251 1256 1258 1261- 
1262 1269 1274 1279 1281 1283 
1285 1287-1289 1294-1295 1305
1307 1313-1314 1316-1320 1329 
1332 1341-1342 1345 1349 1356 
1362-1363 1365-1366 1368-1370 
1374 1381 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420 
Tissue Origin | RNA Source | Hyseq Library Name | SEQ ID NOS:
infant brain | Columbia University | IB200T



1423 1429-1431 1435-1436 1439- 
1441 1443 1447-1449 1451-1452 
1454-1455 1457 1459 1463-1465
1468 1470-1471 1475 1479 1482- 
1483 1485 1493-1494 1496 1498-
1499 1502-1503 1505-1507 1509 
1522-1523 1525 1528 1531-1533
1542 1546-1547 1549-1550 1554-
1555 1563 1565-1567 1569 1575 
1580 1583-1586 1588 1590 1592- 
1593 1595 1598 1600-1601 1608- 
1610 1612 1614-1616 1619 1621 
1624 1626-1627 1630-1633 1637
1639-1640 1642 1644 1647 1652 
1654-1655 1658-1659 1664-1665 
1672-1673 1676-1681 1685-1688 
1693-1695 1701-1702 1704 1708 
1717-1720 1723-1724 1726-1728
1733 1735-1741 1743-1744 1752 
1755-1758 1762 1765 1771 1774 
1777-1778 1786 



infant brain | Columbia University



17-18 20-23 29 34 43 60 68-69 

78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260
276-281 286 290-292 2BS- 300-301 
310 322 324 331 334 339 346-347 
349-350 352 357 371 375-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516
520 530 534 536-540 551 563 572- 
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674- 
675 678 689 715 717 727-728 730
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1536 1546 1557 
1559 1567 1572 1587 1595 1598 
1610-1612 1615 1631 1639 1644
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778 



infant brain | Columbia University



IBM002 | 101 113 139 152 260 279 290-292
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



IBS001 | 10 12 119 175 279-281 321 334
371 446 551 563 623 652 667 669






WO 01/53312 



PCT/US00/34263 



Tissue Origin | RNA Source | Hyseq Library Name | SEQ ID NOS:
671-672 819 949 966 1113 1130 1151 1188
1193-1194 1196 1229 1258 1265 1271 1287
1317-1319 1324-1325 1342 1423 1440-1441
1448 1471 1482 1525 1532 1546 1562 1569
1588 1591 1610 1618 1647 1649 1658
lung, fibroblast | Strategene | LFB001 | 5-9 17 20-21 25 63-69 82 94 105
153 157 197-198 203 207-208 212-213
223 262 266 233 302 321 326 333 356
370 427 430 436 446 462 472 493 498
503 516 519 527 535 537-540 542-544
562 565 567 586 599-600 607 615 630
647 662-664 692-694 712 719 745 748
775-777 794-796 810 837 843-847 849
854-856 869 876 903 934 953 955-956
964 975-976 984 1000 1005-1007
1024-1025 1033 1039 1053 1064 1070
1072 1082 1112-1113 1134 1136-1138
1140 1195 1223 1232-1233 1246 1279
1285 1295 1311 1320 1334-1335 1343
1427-1428 1446 1478 1482 1493 1504
1537 1552 1555 1567 1575 1582 1598
1620 1625 1632 1638 1645 1654-1655
1652 1680-1681 1684 1686 1690 1696
1702 1711 1733 1741 1760-1761 1778
1785








lung tumor | Invitrogen | LGT002 | 5-10 18 20-21 29 33-36 40 43 52
54-55 61 65-66 68-70 73-75 80 85
88-89 93-94 100 103 106-108 112-113
115-116 118-119 123-124 126 130-132
135-137 139-141 143-144 147-148
151-153 155-156 159 161 164 169 171
179-180 185 190 192 194 196-199
203-208 210 212-214 216-217 219 222
233 240-241 244 246 251-252 255-256
261-262 266 272 276-277 279-281 284
286 288 290 295 298 301-302 309-312
317 321 329 332 341-342 344-345 348
352 358-360 363 368 370-371 376
380-381 384 389-390 398 400 409 414
423 426-427 430 432-436 443-444
450-451 454 462 468 472-477 480-483
487-488 490-491 493 496-498 500
503-506 509-512 515-516 519 521-523
526 530 534 541 544 547 554 557 564
566-567 572-576 585-586 588-589
595-596 601 607 611-612 615 619 621
623 626 630 632-633 644 647 649 651
655-656 660 662-665 667 669 672
683-684 696 700 706 710 713 716
718-719 722-723 728 734-739 743 750
752 763 765-766 773-778 784-785
787-789 791 800 802-803 809-812 814
824 826 828-829 832 838-839 841-845
849-850 852-855 857-861 864 866 874
878-880 882 887 890-891 897-898 902
904 906-907 910 916 918-920 922
924-925 927 930-932 934-935 937 947
950 953 955-956 961 963 966-967 969
971 977-979 981 984 986-987 990
992-993 995 997 999-1001 1005-1007
1009 1012-1013 1018 1020 1022-1024
1026 1029-1030 1033 1038 1041
1045 1047-1050 1052 1054 1055
1059 1063-1064 1067-1071 1073-
1074 1078 1085 1087 1089 1095-
1097 1104 1106 1107 1109 1112
1116 1117 1119 1126 1134 1135
1139 1141-1142 1144-1145 1148
1152-1153 1156-1158 1167 1170
1172 1178 1195-1196 1198-1200
1202 1204 1208 1214 1216 1219
1222 1227 1234 1241 1247 1252
1257-1258 1265 1267-1270 1276
1278 1280-1281 1283 1285 1288-
1289 1295 1300 1305 1308 1312
1317-1321 1329 1338-1339 1341
1344-1346 1349-1351 1353-1355
1357 1365-1366 1369 1378-1379
1383 1385 1394 1397 1400 1402-
1403 1408 1417 1419 1423-1426
1431 1433-1436 1438 1444 [illegible]
1448 1454-1455 1460 1466 [illegible]
1470 1474 1480-1481 1483 1486-
1488 1490-1491 1494-1496 1506
1508-1509 1511-1512 1515-1516
1519 1523-1524 1528-1529 1536-
1540 1546 1549-1550 1555 1560-
1561 1565 1567 1569 1575 1588
1591 1593-1594 1596-1598 1600-
1602 1608 1614-1616 1618 1620
1624-1625 1627-1632 1636 1639
1644 1645 1647-1649 1652-1653
1656-1662 1664 1666-1667 1670-
1671 1673-1675 1678-1679 1683
1685-1688 1690-1692 1696 1699
1705 1709 1716-1717 1722 1727
1730 1735 1739 1741 1743 1744
1748-1749 1753 1760-1762 1765
1767 1770-1771 1773 1775-1776
1778 1779 1786

lymphocytes | ATCC | LPC001



4 11-12 18 24-25 30-31 48 50-51
56-57 68-69 80 92 98 103 105 110 
126 137 152-153 157 165 172 188- 
189 197 203 210 217-218 222-223 
225-226 229 231 247 251 256 264 
272 280-281 284 300-301 321 325- 
326 339 348 352 357 371 382 384 
390 400 404 412 414 421 423 426- 
427 430-431 445 447-448 451 454- 
455 475 503 516 526-527 530 537- 
540 549 556-560 563 574 577 589 
602 613 615-617 621 623 628-630 
636-637 647 649 657-659 690 697 
717 723 755 764 775-777 780 786 
789-790 793 800 802 822 838 849 
866 869 876 881-883 892 898 906- 
907 911 921-923 928 975 990 992 
996 1001 1004-1007 1033 1050 
1054 1078 1107 1135 1140-1141 
1143 1148 1158 1163 1177 1199 
1205 1216 1226 1231 1236 1241 
1244 1250 1258 1260 1265 1269- 
1271 1290-1293 1308 1312 1317 
1319-1320 1339 1345-1346 1348 
1350-1351 1357 1367 1369 1379 
1381 1383-1384 1386-1387 1389 
1394 1397 1405 1423 1425-1428 
1431 1437 1446 1448 1461 1466 
1470 1472 1474 1482 1492 1506 
1528 1537 1546 1549 1591 1598 
1600 1603-1604 1606 1627 1636 
RNA Source | Hyseq Library Name | SEQ ID NOS:
1638 1647-1649 1651 1658-1659 1664
1676-1677 1680-1681 1687-1688 1699
1711 1715-1716 1726 1728 1737 1740
1746 1748 1752 1756 1758 1777 1779








leukocyte | GIBCO | LUC001 | 3-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51 54-58
60-63 68-69 75 79-80 82-83 85 88-91
93-96 98 100 103-104 107-108 112 116
119 123 125-128 134-140 142 147-149
151 153 155 157 162-163 167 169-172
174 177-179 186 190 192-199 203-207
210 212-215 217-219 222-223 229
235-236 247 251 255-258 260 262 272
274-277 280-281 285-286 297-301
307-310 313-314 316-317 321 325-330
333-334 340-342 348-349 352 354-358
370-371 380-385 387-388 400 405
408-410 412 414-416 421-425 430-431
434-435 437 439 441-442 445-451
453-454 456 459 461-464 468-472
474-479 481 483-485 487-491 496
499-501 503-504 509-513 516-519 522
526-527 529-531 534 536-540 542
547-549 553-559 566-567 571 574-577
579 582 584-586 589 593 595-597
601-602 604 606-607 611-613 615-621
623 627-629 633 636-637 642 644-650
655 659-660 662-665 667 669 674-675
678 682-684 692-696 698 700 706 708
710 716-720 725-726 729-736 738-739
743-745 749 751 753 756 759 765-766
768 770-773 780 784-786 788-790 793
796 798 800 802-803 810-811 814 817
819 826 828-830 832 834-836 838 843
845-860 863-864 866-871 877-878
881-892 894-896 898 902 904-914 916
919-925 927 930-932 935-936 941-942
945 948-949 953 955-956 958 960-962
964 967 970-971 973 975 977 985-990
992-993 995-996 999-1002 1004-1009
1011 1014 1017-1019 1022-1023 1025
1027 1029-1031 1033-1036 1038 1041
1043 1047 1050 1053-1054 1058-1059
1061-1062 1064 1068 1070 1072 1078
1085-1086 1089-1091 1093 1097
1106-1107 1110-1113 1115-1117
1122-1123 1125 1129 1132-1133
1135-1137 1140-1145 1152 1158 1163
1168 1170-1174 1176-1178 1180
1182-1183 1186 1195 1198-1200 1202
1205-1206 1211 1216 1219-1221
1223-1227 1230-1236 1238-1242 1247
1252 1254 1256 1258 1261-1262
1264-1265 1269-1270 1272-1275 1277
1280-1284 1287-1293 1299-1300 1306
1308 1312-1313 1317-1320 1322
1324-1330 1333-1335 1339 1341
1343-1347 1349 1353-1357 1359-1361
1365-1367 1369-1370 1373-1374 1377
1379-1381 1386-1387 1394 1400 1403
1409 1419 1423 1425-1428 1430-1431
1433-1434 1437-1438 1440-1442
1446-1448 1450 1453 1458-1459
1463-1464 1468 1470-1471 1474
1477-1478 1482-1488 1490-1493
1496-1501 1504 1506 1509 1512-1513
1516 1519 1521-1522 1524-1525
1527-1528 1531 1534 1538 1541
1545-1547 1549-1550 1553 1555-1556
1560 1565 1567 1575 1580 1589 1591
1594 1596 1598 1600-1602 1606-1608
1611 1614 1620-1621 1624 1626-1629
1631-1632 1636 1638-1639 1641
1644-1645 1648-1650 1653-1655
1658-1660 1662 1669-1670 1675-1679
1684-1688 1690-1692 1696 1700 1702
1707-1709 1711 1716-1717 1720 1723
1725-1727 1733 1737-1738 1741
1743-1744 1748-1749 1752 1755
1760-1762 1769 1771-1772 1781-1784
1786

leukocyte | Clontech | LUC003 | 4 35-36 44-45 61 68-69 75 82 102
119 139 154 179 197 244 280-281
324 372 404 430-431 455 461 476-
477 481 503 537-540 554 575-576
581 589 608-609 621 622 624 630
632 647 662-664 669 679 698 764
773 775-777 802 848 851 856-857
879 905-907 915 949 952 990 992
1002 1113 1119 1170 1183 1216
1236-1237 1241 1275 1346 1353
1357 1359 1377 1506 1515 1534
1553 1591 1600 1613 1614 1621
1628 1670 1676-1677 1691-1692
1699 1733 1738 1772



melanoma from cell line ATCC #CRL 1424 | Clontech | MEL004 | 25 35-36 43 80 104 126 128 150
163 166 188-189 197 210 215 220
271 277 280-281 310 317 336-338
345 351 372 380-381 383 387 412
415-416 430 445 448 454 456 467
481 490 499 503 526 528 546 548
567 575-576 588 601 613 615 647
660 665 734-735 737 759 778 787
790 800 832 845 856 859 869 878
883 887 905 914 932 934 958 976
985 990 992 999 1000 1025 1031
1038 1050 1055 1068 1074 1088
1099-1102 1107 1136-1138 1149
1156 1163 1172 1190 1195 1200
1214-1215 1217 1226-1227 1235
1238-1239 1244 1253 1278-1280
1293 1311 1320 1330 1334-1335
1345 1355 1367 1386-1387 1394
1403 1406 1414 1423 1437 1442
1465 1521 1529 1536 1539 1541
1547-1548 1582 1620 1626 1631
1638 1647 1653 1660 1667 1669-
1670 1680-1681 1696 1704 1715
1724-1725 1731 1732 1750 1760-
1761



mammary gland | Invitrogen | MMG001 | 5-8 10 12 14-18 20-21 24-25 29
33-39 42-43 52 55-58 60-64 68-69
71 73-74 79-80 82 89 98 100 103
106 108 112 123 128 133-137 144-
146 148 150-152 154 158 159 165-
166 170-172 174 176 178 181-185
188-190 194 198 201-206 210 217-
222 224 227-228 231 233-237 247
251 253-254 256 261-263 266-267
271 276-277 279-281 284 286 288
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ID NOS: 








Library Name 




















290 


297 299 301 


304 


309- 


312 318 








320- 


321 323-325 


327- 


329 


331-332 








334 


339 341 344 


-34S 


348 


350 356 








359- 


360 362-363 


368 


371 


376 379- 








303 


380 390 393 


-395 


397- 


398 405 








4C8 


412 414-415 


423 


430 


434 -437 








44 1- 


444 448 451 


-455 


462- 


464 474 








476 


479 482 485 


-486 


488 


490 494- 








4S5 


498 503 506 


509- 


•512 


51o -bl7 








519- 


520 522 527 


529 


534 


537-541 








•547 


549 554 557 


562 


572- 


574 587 








589- 


S91 597 602 


607 


618 


623 628- 








629 


632 634-640 


644 


647- 


648 650- 








652 


655 657-658 


660 


665 


667 669- 








672 


674-676 679 


682 


688 


695-696 








706- 


707 710 713 


717 


720 


722-730 








732- 


734 736 738 


743 


747- 


748 750 








755 


759 761 766 


770 


780 


784 786- 








789 


794 803 806 


-807 


809 


814 817- 








822 


827-829 837 


642 


854- 


858 863- 








•864 


866 869-870 


872 


878 


881 889 








893- 


900 904 906 


-907 


911 


916 919 








921- 


923 926 935 


-937 


946 


948-949 








953- 


954 957 960 


-961 


963 


965-9S6 








970 


977-978 984 


-989 


993- 


997 








1000 


-1001 


1005- 


10C6 


1008 


1013- 








1014 


1016- 


-1017 


1023 


1025 


1027 








1032 


-1033 


1036 


1039 


1043 


1045 








1055 


1057- 


-1058 


1063 


1068 


-1075 








1077 


-1078 


1085 


1087 


1089 


-1091 








1095 


-1102 


1107- 


1108 


1112 


-1119 








1121 


-1123 


1131- 


1133 


1136 


-1137 








1139 


-1142 


1144- 


1145 


1148 


-1149 








1153 


1159 


1167 


1170 


1172 


-1173 








1183 


-1185 


1190- 


1192 


1196 


-1199 








1207 


-1208 


1212 


1216- 


-121B 


1222- 








1223 


1225 


1231 


1234 


1240 


-1241 








1247 


1253- 


-1254 


1258- 


-1259 


1261- 








1262 


1270- 


-1280 


1283 


1285 


-1286 








1298 


1307 


1314 


1316- 


-1320 


1323-. 








132S 


1330 


1334- 


1335 


1342 


-1345 








1349 


-1352 


1354- 


1355 


1359 


1369- 








1370 


1377 


1379 


1381 


1383 


-1384 








1389 


1405 


1414 


1419 


1421 


-1423 








1425 


-1426 


1428- 


1429 


1431 


1434- 








1437 


1439 


144 8- 


1449 


1454 


1457 








14S0 


-1464 


1466 


1471 


1480 


-1483 








14B7 


1489-1491 


1493 


1505 


1507 








1512 


1519 


1526- 


1528 


1532 


1534 








1536 


1539 


1542 


1547 


1549 


-1550 








1554 


1561- 


-1562 


1564 


1567 


1572 








1576 


-1579 


1581- 


1532 


1587 


-1588 








1592 


1594 


1596- 


1597 


1601 


-1602 








1607 


-1608 


1610 


1612- 


-1616 


1618 








1621 


-1622 


1625- 


1626 


1631 


1635- . 








1636 


1641 


1643- 


1644 


1647 


1650 








1 co 

l o b<J 


1654- 


-1655 


1657-1658 


ioOU 








1662 


1664- 


•1666 


1669- 


-1671 


1673- . 








1674 


1676- 


•1677 


1680- 


■1685 


1689- 








1692 


1701 


1706 


1713- 


-1715 


1719- 








1720 


1723- 


-1728 


1730- 


-1732 


1738 








1740 


1742- 


-1744 


1746- 


-1747 


1749 








1751 


1753 


1760- 


1762 


176S 


-1768 








1771 


1774 


1776- 


1777 


1779 


1783- 








1784 


1786 










induced neuron cells | Strategene | NTD001 | 29 35-36 80 116 123 156 163 181
214 230 280-281 284-285 307 321 330
340 358 371 375 377 380 382 422 424
492 497 532-533 542 546
549 565 586 595 612 645-647 654
734 775-778 780 792 799 821 826
856 858 875 936 953 985 990 992
1041-1043 1055 1072 1104 1193-
1194 1206 1223 1246 1253 1274
1288-1289 1291 1294 1311 1320
1349 1359 1412 1423 1485 1620



retinoid acid 

induced 
neuronal cells 
neuronal cells 



Strategene 



pituitary 
gland 



Strategen 



NTR001 
NTU001 



1623 1645 1684 1705 1715 1751
6-8 78 268-269 277 383 431 506 



Clontech 



PIT004 



623 677 731 999-1000 1199 1425- 

1426 1547 

29 65-66 80 82 110 119 146 152
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1148 1185
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



311 314 379 408 419 430 454 1055

1095-1096 1272-1273 1312 1320 

1378 1652 1671 1720 1725 1736 
1741 1755 



PLA003 



prostate 



Clontech



PRT001



5-8 124 208 277 370 843 906-907 
1280 1317-1319 1359 1609 1621 
1737 

9 46 57 71 107 147 171 177 197



201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662-
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045 
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1489 1513 1517 
1527 1531 1536 1598-1599 1628 
1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



Invitrogen



REC001 



17-18 29 53 62-63 71 73-74 83 86
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948- 
949 960 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425-
RNA Source | Hyseq Library Name | SEQ ID NOS:
















14<£b 14Jo 1469 1474 1477 








118/ AOlb IDS /"J.999 1334 1596 








Itflfl t<rio 1<T2"7 t Cdfl i<:cq -i 
■loiv io^z iO£ / ibi4 lbjo JLbb* 








1009 iDOQ lOD? ID i3"IO I / 1/4 9 








1786 


salivary gland


Clontech




iu 33 9/ 10.5 110 140 149 152 158 








iijO £i / -/lo z — z*k i ibo 301 308 








jJ.^ -3<il jjJ jsl J34 ioO a 10 437 








4so 4 t i> 4o/ 494 49b 501 bJ5 555 








303-3/0 5/2-3/J 590-551 624 636 








obi '59 762 /64 768 771 783 800 








flno ATc: p/ip flf-c p7o one am mc 
ouj ozo olo oo3 0/9 9V/b 90 / 925 








JOJ iUiO lU^U 1U/3 KJ4b 








1055 1066 1103 1150 1172 1181 








1234 1281-1282 l3flB.19ftO noo 








1315 1320 1333 1336-1337 1346 








1359 1373 1379 1474 1447 144Q 








14/4 1482 1492 1494 1498 1511 








1523-1524 1537 1554 1596 1626- 








1D« / ID JD I09£ X033 1D3D lOOD 








1671-1672 1691-1692 


salivary gland 


Clontech 


SALS03 


13D OZO i^i^J ilOJ'llD^ 


skin 




Of DUU X 


11UU 


fibroblast 








skin 




or BQ02 


252 736 1025 1253 


fibroblast 


, 








ATCC 


SFB003 


709 1119 1350 1631 1653










small intestine | Clontech | SIN001 | 25 142 146-147 151 155 198 203
244 260 271 280-281 286 288 298
301-302 308 312 334 340 371 398
408 412 414 416 423 425-427 430
434-435 445 452 454 478 503 516
519 521 523 543 547 549 555 559
563 569-570 585 592 604 611 626
628-629 632 650 659 681 710 714
718 750 764 780 798 829 842 857
859 866 887 892 894-895 901 904
906-907 912 919 935 997-998 1000
1007-1008 1026-1028 1044 1055
1089 1097 1116-1117 1131 1148
1169 1199 1219 1234 1247 1264
1279 1316 1320 1326 1341 1343
1349 1351 1374 1387 1398 1400
1403 1407 1423 1428 1465 1498
1501 1521 1550 1556 1585 1597
1636 1638-1639 1645 1653 1656








lbfaz lb71 lo /b 16B4 1691-1692 








1/U4 1/11 1/17 1719 1722 1725- 








1 IOC 1700 1 *7n 1 71A IT^T 1 "iji ji 

X/«so / J4 1/4.J-1744 








1 /o2 1/67 1780 1785 


skeletal 


Clontech




-Lo iU-il aZ 04 1U1 Ho ±J4 14 0 








151 153 166 225-226 258 274 277 








289 329 412 414 424 4AO AO 








459 470 488 503-504 537-540 647 








660 673-675 715 773 780 786 830 








o ft p" (m or a ft /•* a ft ft ft ft /> /\ *s 

905 922 950 963 982 990 992 1020 








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712 


skeletal muscle | Clontech | SKM002 | 168 1683 1712
skeletal muscle | Clontech | SKMS03 | 235-236 1409
skeletal muscle | Clontech | SKMS04 | 235-236
spinal cord | Clontech | SPC001 | 4 9 11 17 30-31 35-36 43 46 60
Hyseq Library Name | SEQ ID NOS:
82 85 92 94 108 110 116 139 157
167 198 204-205 210 215 229 256
259 277 280-281 300 302 304 315
317 372 379 387 392 419 426-427
430 433 448 467 473 487 489 506
509 513 519 524 525 537-540 543
547 549 551 559 567 569-570 5S3
607 616-617 623 625 637 649-650
652 657-658 670-671 673 679 681-
682 709 711 715 719 728-729 734
749-750 753 775-777 782 789 791
809 820 832 834-836 847 849 854-
855 858 861 864 871-872 875 884
898 906-908 917 919 924 934 942
944 970 985 990 992-993 998 1013
1039 1053 1059 1065 1072 1075
1077 1082 1085 1097 1103 1109
1116-1117 1128 1134 1151 1170
1174 1192-1194 1215 1225 1241
1243 1283 1294 1307 1312 1320
1323 1327 1330 1350 1353-1354
1356 1359 1368 1375 1400 1406-
1407 1423 1429 1437 1443 1448
1454 1470 1482 1492 1501 1508
1511 1529 1538 1548 1549 1565
1571 1578 1598 1600 1614 1625
1627 1630 1639 1646 1651 1652
1670 1686 1696 1740 1751 1755
1771

SPLc01 | 117 312 326 348 424 426-427 431
845 866 1320 1330 1333 1344
1355-1357 1371 1387 1397 1446
1538 1579 1669 1686 1739 1767

STO001 | 10 15-16 61 68-69 100 117 149
197 201 227-228 231 249 273 280-
281 287 291-292 302 312 358 362
426-427 430 446 462 475 479 535
597 620 630 651 662-664 722 739
780 782 785 846 919 960 964 966-
967 976 1008 1012 1032 1042 1063
1071 1135 1170 1208 1234-1235
1259 1277 1280-1281 1322 1349
1359 1369 1449 1468 1474 1478
1487 1493 1498 1557-1559 1622
1634 1651 1653 1729

THA002 | 9 11 25 85 87 112 137 146 180
190 198 206 210 212-213 235-236
239 261 268-269 279 290 301 325
333-334 341 351 356 364-365 379
388 393 396 419-420 441-442 458
477 483 508 525 531 549 567 606
608-609 647 681 715 725-727 736
774 782 784 794 827 863 890-891
899-900 961 997 999-1001 1004
1034 1055 1097 1129 1144-1145
1150-1151 1157 1172-1173 1177
1193-1194 1208 1220 1249 1280
1305 1345 1355 1369 1434-1435
1440-1441 1454 1496 1546 1549
1562 1572 1578 1590 1594 1613-
1614 1640 1651-1652 1671 1687-
1688 1703 1743-1744 1746-1747
1753

THM001 | 44-45 54 57-58 62-64 79 104 123
126 134 153 193 212-213 218 242-
243 258 274 277 279 297 301 307
327 330 333 342 351 358 371 410
430 445 465-466 468 471 483 487
493 503 506 509 517 526 535 537-
Tissue Origin



RNA Source 



Hyseq 
Library Name 



thymus 



Clontech 



THticQ 2 



SEQ ID NOS:



"540 54 tf" 
591 604 
649 656 
728 735 739 
775-777 780 
824 826 



660 



548 554 567 584 586 590- 
612 621 638-640 645-647 
655 670 698 710 720 
746 759 762 766-767 
734-785 800 802 809 
828 845 851 858-859 864 
866 870-871 878 884 887 892 899-
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1115- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1277 1282 1320 1329 1349 
1369 1383-1384 1417 1419 
1425-1427 1448 1477 1488 
1536 1554 1620 1644 1646 
1654-1655 1661-1662 1669-
1670 1674 1676-1677 1685-1688 
1707 1711 1731-1732 1737 



1271 
1367 
1423 
1493 
1649 



5-9 15-21 25 33 35-36 43-45 48

50-51 54-55 60 75 83 87 89 93
98-100 102 105 112 117 135-137
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261-
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 



464-467 
497 500 
524 526 
554-555 



470 472 
504 506 
530-531 
565-566 



575-577 586-587 
612 630-632 634 
660 666-667 669 
700 703 708 720 



474-476 483 488 
513 516 519-520 
534 537-540 549 
569-570 572-573 
595 603-604 606 
636 647 650 657- 
673-675 678 698
725-726 731 738- 



739 743-744 750-753 757 759 763 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007-
1008 1014 1016 1039 1041 1073- 
1097 1109 
1140-1141 
1175-1177 
1211 1216 



1079 1089 
1122 1131 
1163 1172 
1198 1206 
1227 1234-1243 
1271 1280-1281 
1317-1320 1322 
1327 1330 1334-1335 
1350-1351 1355 1357 
1374 1377-1379 1386 
1397 1400 1402 
1423 1425-1427 
1474 1477 1483 
1506 1525 1536 
1594 1598-1600 
1621 1623 1625 
1644 1647 1649 
1662-1663 1671
1686-1688 1693 1705 
1711 1717-1718 1726-1727 
1733 1737-1738 1743-1745 
1761 1771-1772 1779 1786 



1074 
1117 
1145 
1196 
1223 
1267 
1308 



1392 
1417 
1466 
1504 
1566 
1614 
1641 
1658 
1681 



1114- 
1144- 
1186 
1220 
1261-1262 
1284 1290 
1324-1325 
1339 1346 
1360 1370 
1389-1390 
1406-1407 
1440-1441 
1493 1498 
1545 1549 
1608 1611 
1632 1639 
1653-1656 
1673 1678- 
1707 
1731- 
1758- 
Tissue Origin | RNA Source | Hyseq Library Name | SEQ ID NOS:
thyroid gland | Clontech | THH001
trachea | Clontech | TRC001



4 9-10 20-21 37-39 48 50-51 54-
57 60-61 65-66 71 83 94-96 98
100 102 104 110 112 115-117 119
123 127 133 136-137 140 149 152
153 155-158 163-164 168-169 171
186 190-192 197 201-203 219-220
229 233-237 246-247 253 256 258
262 265-266 268-269 277 280-281
284-286 288-289 298-299 302 309-
311 317 321 326 332 335 341-342
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594-
595 601-602 604 606 610 612 615 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823-
824 826 828 833 838 841-845 847
849 857-860 867 874-875 878 880-
881 887-888 890-892 894-895 898
908 910-911 913-914 922-923 926-
927 929 932-934 937 939 941-942
948 953 957 961 963-964 966 978-
979 981-982 987 990 992 1001
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1055 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167 
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428 1436- 
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555- 
1559 1562 1567 1578 1590-1591 
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786 
9 29-31 46 48 87 104 107 110 135 



158 222 262 266 286 301 318 331 
352 372 377 384 414 424 445-44<T 
454 472 474 481 496 560 579 588 
593 597 607 612 626 681 702 719 
810 859 866 878 894-895 912 916 
922 932 935 1046 1075 1080 1099- 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405 
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653 
1667 1671 1676-1677 1683 1691- 
1692 1711 1717 1726 1772 



Clontech 



CTTR001 



17 19 25 41 46 57-58 61 89 104 
108 139 152 174 198 200-201 206 
263-265 274 290 387 408 420 438 
446 448 452 473 491 493 499 503 
506 513 519 522 526 530 542-543 
560 601 610 632 659 665 720 751 
773 780 833 845 857 872 877 912 
929 934 937 996 1009-1011 1018 
1050 1075 1107 1124 1170 1219 
1258 1279 1287 1310 1320 1323 
1343-1344 1375 1437 1451-1452 
1478 1481 1498 1519 1521 1536 
1552 1579 1597 1602 1606 1620 
1626-1627 1649 1652 1661 1670 
1719 1722-1723 



TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 
IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PROH14 protein 
sequence . 


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein 
PRO943. 


2389 


99 


3 

4 
5 


AF113136 

AF017806 
X02761 


Homo sapiens 

Mus musculus 
Homo sapiens 


IL-1 receptor-associated 
kinase-M; IRAK-M 
Zn-15 transcription factor 
fibronectin precursor 


3043 

6351 
10535 


100 

77 
98 


6 
8 
9 


X02761 
X02761 
AJ011679 


Homo sapiens 
Homo sapiens 
Homo sapiens 


fibronectin precursor 
fibronectin precursor 
Rab6 GTPase activating 
protein, GAPCenA 


8990 
12564 
5251 


89 
99 
99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone 
KP104l5-encoded protein. 


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G))) 


896 


100 


13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 
15 


AF213457 
A^233^453 


Homo 
sapiens 
Homo sapiens 


triggering receptor expressed 

on myeloid cells 2 

RACK-like protein PRKCBP1 


1238 


100 


17 
18 


AF201303 
AF064205 


Homo sapiens 
Homo sapiens 


DHFR oribeta-binding protein 
RIP60 

dynactin 1 p150 isoform 


3124 
3130 


99 
98 


19 


U00059 


Saccharomyce 
s cerevisiae 


Yhr121wp 


6377 
174 


100 
26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1801 


99 


21 
22 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


3083 


99 




AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent 
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O- 
sulfotransferase 


2211 


99 


25 

26 
27 


U334S0 

Y44488 
1743 701 


Homo 
sapiens 
Homo sapiens 
Homo sapiens 


DNA-directed RNA polymerase 
I, largest subunit 
ACRP30R2 variant protein. 


8777 

1387 


98 
100 


28 

29 

30 


U02032 
Y41324 


Homo sapiens 
Homo sapiens 


ribosomal protein L23a 
ribosomal protein L23a 
Human secreted protein 
encoded by gene 17 clone 

"Wrll II. 1 


791 
767 
1083 


100 

97 

99 




W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


631 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid 
oxidase HAOX2 


1811 


100 


33 
34 


Z29481 
AB001451 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase 
ocx f 


1507 


99 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


2869 
1667 


100 
99 


36 

37 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 




Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino 
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino 
acid sequence. 


1726 


99 


39 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino 
acid sequence. 


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1 


3747 


100 


41 


Y4 2 7S0 


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1) . 


795 


100 


42 


AP282S26 


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus 


Elf-1 


2724 


83 


45 


U19617 


Mus musculus 


Elf-1 


2062 


86 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO:24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5845 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding 
protein 


1089 


96 


53 


L31733 


Mus musculus 


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


1491 


100 


57 


Z50907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript . 


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


6089 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


4dl4 


91 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15 . 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing CXXC 
domain 1 


1390 


100 


62 


Y66460 


Homo 
sapiens 


Membrane-bound protein 
PRO783. 


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein 
PRO783. 


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DH1308 1 clone 
secreted protein. 


157 


30 


67 


AJ245738 


Homo sapiens 


claudin-15 


1206 


100 


68 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4906 


86 


70 


Z82059 


Caenorhabdit 
is elegans 


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AP224278 


Homo sapiens 


PMEPA1 protein 


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus mus cuius 


selectively expressed in 
embryonic epithelia protein- 1 


1485 


74 


76 


AE000406 


Escherichia 
coli 


putative DNA topoisomerase 


950 


100 


77 


X99302 


Homo sapiens 


Pop1 


655 


100 


78 



AL136538 


Schizosaccha 

romyces 

pombe 


similarity to S. cerevisiae 
kti12 protein 


210 


31 


AF129756 


Homo sapiens 


G4 


1554 


99 



PCT/US00/34263 



SEQ 
ID 
NO: 



80 



90 



ACCESSION 
NUMBER 



AL096768 



AL096768 



Homo sapiens 



Homo sapiens 



thaliana 



DESCRIPTION 



dJ6S8B15.2 ~ ' 

(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 



CU858B16.2" ~ 

(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 




contains similarity to pre- 
mRNA splicing 
factor-gene_id:MRB17.2 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



2033 



1220 



634 



100 



96 



IT 



92 
93 



Mus musculus 



liomeodo 



Mus musculus 



A61971 
Y99365" 



unidentified 



iomain protei n 
phtf protein 



5S4~ 



MCSP 

Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO: 86. 



619 



11676 



61 



99 



95 



Homo sapiens 



3890 



Homo sapiens 



Human signal peptide" 
containing protein HSPP-8 

SEQ ID NO: 8. 

protein kinase WNK1 



1031 



100 



100 



Rattus 
norvegicus 



2428 



95 



~97~ 
"98" 



Y92S13 ' 
AL021366 



Rattus 
norvegicus 



protein kinase WNK1 



1961 



Homo sapiens 
Homo sapiens 



Human OXRE-10. 



1626 



SuT 



100 



99 
100 



101 



102 

ior 



104 
10S 



106 

"10T 



AC005733 



Y95293 
AL118S01 



Homo sapiens 



CICK0721Q.3 (Kinesin related 
protein) 



3423 



Homo sapiens 



AJ006267 



AF100753 



Homo sapiens 



R33083_l 
Human GEF 1 containing NEK- like 
kinase substrate sGNX. 



1974 
4092 



Homo sapiens 
Homo sapiens 



du-H9lN16.i (A novel protein 
'translation of the cDNA 
DKFZp566A0946, Em: AL050069) ) 



1509 



ClpX-like protein 



ABQ15982 
AF151074 



M35522 



Homo sapiens 



ancient ubiquitous 46 kDa 
protein AUP1 



3233 



2042 



Homo sapiens 



serine/threonine kinase 



Canis 
familiaris 
Homo sapiens 



HSPC24Q 

GTP-binding protein (rab7) 



4718 



354 



100 



99 



99 



100 



100 



96 



100 



64 



50 



NTII-i nerve protein, 
facilitates regeneration of 
nerve cells . 



2337 



93 



Homo sapiens 



NADH-cytochrome b5 reductase 
isoform 



1290 



93 



109 
Tl0~ 



Homo sapiens 
Homo sapiens 



F23269 2 



3369 



99 



111 
112 



X52425 
Y41686 



Homo sapiens 
Homo 
sapiens 
Homo sapiens 



RAN binding protein 16 



interleukin 4 receptor 



3285 



4496 



00 



Human PRO274 protein 
sequence. 



2285 



100 



114 



Tir 



iir 



Y71071 



Homo sapiens 



AL04954 8." 



Homo sapiens 



AF189817 
W30891 



Mus musculus 
Homo """" "*~ 



Mitogen activating protein 
kinase BRKl . 

Human membrane transport 



1991 



protein, MTRP-16 . 
CLO-398G3.1 (ortholog of rat 



1190 



CPG2) 



3497 



evectin-2 



Human cytostatin III protein 



124 



715 



100 



99 



99 



90 
99 





118 


AF116618 


sapiens 
Homo sapiens 


PRO1038 






119 


Y08915 


Homo sapiens 


alpha 4 protein 


1459 
1748 


100 
100 


120 


AF098070 


Drosophila 
melanogaster 


Lis1 homolog 


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 


Y70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


93 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP) . 


833 


99 


125 


M63109 


Leishmania 
major 


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


Atu 


935 


36 


127 


Z63220 


Caenorhabdit 
is elegans 


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens 


Human zsig44 protein. 


463 


100 


130 


AF115391 


Lactobacillu 
s sakei 


ribokinase RbsK 


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


9l£ 


87 


133 


W52B11 


Homo sapiens 


Human Dbi/acbp-like protein 
(DBIH) . 


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


3230 


100 


135 


M69181 


Homo sapiens 


non-muscle myosin B 


189 


20 


136 


W74882 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83 . 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU81. 


855 


99 


138 


AL033520 


Homo sapiens 


CU349A12.1 (similar to 
KIAA0701 protein) 


424 


39 


139 
140 


AF020261 
X70394 


Santalum 
album 

Homo sapiens 


proline rich protein 
zinc finger protein 


119 


30 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


1634 
936 


100 
100 


142 


Z68493 


Caenorhabdit 
is elegans 


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 
thaliana 


F3F19.18 


547 


31 


148 


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13. 


1494 


98 


149 


AF0S6490 


Homo sapiens 


cAMP-specific 
phosphodiesterase 8A 


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785 


99 


151 


U10397 


Saccharomyce 
s cerevisiae 


Yhrl46vp ~~ 


515 


53 


152 


X73473 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ3 82110. 5.1 (novel protein 


2034 


99 











similar to arginyl-tRNA) 






154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6 . 


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32. 


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2) . 


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.l.l (hamster 
Androgen-dependent Expressed 
Protein like putative 
protein) (isoform 1) 


1357 


100 


163 


AF12553S 


Homo sapiens 


pp21 homolog 


193 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71 . 


1084 


100 


169 


AP214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AE000871 


Methanobacte 

rium 

thermoautotr 
ophicum 


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. lis. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel. 


3202 


100 


175 


Y07923 


Homo sapiens 


GTP-binding protein 


1205 


100 


176 


W90338 


Homo 
sapiens 


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related 
molecule HCRM-3 . 


1122 


100 


178 


Y41674 


Homo sapiens 


Human channel -related 
molecule HCRM-2 . 


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor 


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


69 


183 


U57344 


Mus musculus 


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33. 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase 


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1, 
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


2605 


99 





194 


AF084259 


Mus musculus 


bromodomain-containing 
protein BP75 


693 


54 


195 


Y00752 


Rattus 
norvegicus 


serine dehydratase (AA 1 - 
327) 


994 


61 


196 


W9S349 


Homo sapiens 


Human foetal brain secreted 
protein fhl70 7. 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236 i. 


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142. 


558 


99 


203 


X1388S 


Nicotiana 
tabacum 


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60 . 


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65 . 


936 


98 


209 


AL121889 


Homo sapiens 


dJ1076El7.l (KIAA0823 protein 
(continues in AL023803)) 


694 


54 


210 
211 


AF226732 
X66295 


Homo sapiens 
Mus musculus 


NPD007 

C1q C chain 


1345 


76 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


970 
966 


73 
100 


213 
214 


Z29328 
AJ002030 


Homo sapiens 
Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


542 


98 


215 


X70649 


Homo sapiens 


progesterone binding protein 
member of DEAD box protein 
family 


1163 
3933 


100 
100 


216 
217 


AF250558 
AL021453 


Homo sapiens 
Homo sapiens 


claudin-2 

dJB2lDll.i (PUTATIVE protein)' 


1169 


99 


218 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N- 
acetylgalactosaminyltransferase 


259 
3331 


100 
99 


219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis 
thaliana 


putative protein 


315 


42 


221 


AL03 17fifi 


Schizosaccha 
romyces 
pombe 


putative proline-tRNA 
synthetase 


811 


41 


222 
223 


AL109736 
X52493 


Schizosaccha 

romyces 

pombe 

Glycine max 


WD repeat protein 

dna directed RNA polymerase 


626 
136 


40 

23 


224 
225 
226 


AL0356S9 
AB032401 
AB032401 


Homo sapiens 
Mus musculus 
Mus musculus 


dJ979Nl.l (dJ979Nl.l) 
mmDj4 


5199 
1761 


98 
92 


227 


X83502 


Saccharomyce 
s cerevisiae 


J1007 


1988 
112 


92 
26 


228 
229 


X83502 
AF143 723 


Saccharomyce 
s cerevisiae 
Homo sapiens 


J1007 

heat shock protein HSP60 


79 


25 


230 

231 


Y66677 
AB027466 


Homo 
sapiens 
Homo sapiens 


Membrane-bound protein 

PRO828. 

spondin 2 


2557 
982 


99 
100 


232 
233 


W95634 
K00365 


Homo 
sapiens 
Homo sapiens 


Homo sapiens secreted 
protein. 

Human cyclin B1. 


1756 
1391 


99 
100 


234 


Y53762 


Homo sapiens 


A GTP-binding polypeptide 


2218 
1017 


100 






SEQ 
ID 

NO: 

235 
i i x 


ACCESSION 
NUMBER 

Z50749 


SPECIES 
Homo sapiens 


DESCRIPTION 

designated RAQ. 
yeast sds22 homolog 


SMITH- 
WATERMAN 
SCORE 

1800 


IDENTITY 
100 


236 
237 
238 


Z50749 

AB026491 

AJ270205 


Homo sapiens 
Homo sapiens 
Entodinium 
caudatum 


yeast sds22 homolog 

PICK1 

putative 

phosphatidylinositol-4- 
phosphate 5-kinase 


1754 
2137 
114 


98 

100 

37 


239 


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3785 


99 


241 
242 


W56538 
AF155107 


Homo sapiens 
Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 
NY-REN-37 antigen 


3436 
996 


99 
99 


243 
244 


AF155107 
AL031320 


Homo sapiens 
Homo sapiens 


NY-REN-37 antigen 
chJ20N2.1 (novel protein 
similar to yeast and 
bacterial cytosine 
deaminase) 


1005 
763 


100 
99 


245 
246 


U37026 


Rattus 
norvegicus 




162 


30 


247 


AL078599 


Homo sapiens 


C1J991C6.1 (novel protein 

similar to C. elegans 
F55A12.9 (Tr:P91086)) 


2391 


98 


248 


U32274 


Saccharomyce 
s cerevisiae 


Ydr386wp; CAI: 0.12 


191 


37 


249 


Y41719 
AB029434 


Homo 
sapiens 
Homo sapiens 


Human PRO864 protein 
sequence. 
ghrelin precursor 


1079 


100 


250 


X97831 


Rattus 

norvegicus 


carnitine/acylcarnitine 
carrier protein 


611 
246 


100 
38 


251 


W80993 


Homo 
sapiens 


Human RIP-interacting factor 
RIF. 


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP0263 2. 


1876 


100 


253 
254 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (H2BGM49) . 


765 


100 


255 


AL^54533 
AF233322 


Leishmania 
major 

Mus musculus 


possible adenylate kinase 
zinc transporter like 2 


265 


34 


256 
257 


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:1. 


1916 
2247 


95 
99 


258 


AL035539 


Arabidopsis 
thaliana 


putative amino acid transport 
protein 


390 


27 


259 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61 


1171 


100 


260 


AL035^89 


Homo sapiens 


dui87Jll.i (novel protein 

similar to protein kinase C 
inhibitors) 


974 


100 


261 
262 


AEQ009G9 

AL050131 
AF019661 


Methanobacte 
rium 

thermoautotr 
ophicum 
Homo sapiens 
Mus musculus 


serine/threonine protein 
kinase related protein 

hypothetical protein 

zeta proteasome chain; PSMA5 


363 
626 


16 
100 


263 

264 

265 


AL035593 
AL022318 ' " 

&F20S940 


Homo sapiens 
Homo sapiens 

Homo sapiens 


uuj wu ° • -L \ novel procexn) 
DK150C2.3 (PUTATIVE novel 
protein similar to APOBEC1) 
endomucin 


1214 

821 

1072 

1289 


100 
100 
100 

100 


266 

267 


&L023583 
\L03454 8 1 


Homo sapiens 
Homo sapiens 


da500Ll4.1 (novel protein) 

dJH03G7.3 (novel protein 

kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


iS9 
1888 


100 
99 
SEQ 
ID 
NO: 
268 


ACCESSION 
NUMBER 

AF161470 


SPECIES 
Homo sapiens 


DESCRIPTION 

HSPC121 


SMITH- 
WATERMAN 
SCORE 
1884 


IDENTITY 
98 


269 
270 

271 


AF161470 
X90763 

M-P ZU f OUU 


Homo sapiens 

Homo 

sapiens 

Homo sapiens 


HSPC121 

HHa5 hair keratin type I 
intermediate filament 
ethanolamine kinase 


1232 
2190 


96 
99 


272 
273 


M32334 
AF161483 


Homo sapiens 
Homo sapiens 


intercellular adhesion 
molecule 2 

HSPC134 


1952 
1436 


100 
100 


274 
276 


Y53C52 


Homo sapiens 


Human secreted protein clone 
df2 02_3 protein sequence SEQ 
ID NO:110. 


663 
587 


61 
100 


277 


Y77576 


Homo sapiens 


Human cytoskeletal protein 

(HCYT) (clone 2195418). 


762 


100 


278 


AF077042 


Homo sapiens 


30S ribosomal protein S7 
homolog 


1269 


100 


279 


Y94907 


Homo sapiens 


Human secreted protein clone 
cal06_i9x protein sequence 
SEQ ID NO: 20. 


1619 


98 


280 


loo / oc 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99 


281 


Z75134 


Canis 
familiaris 


rod transducin 


1816 


100 


282 
283 
284 


Z75134 

AF249873 
AL050007 
AF201931 


Canis 
familiaris 
Homo sapiens 
Homo sapiens 


rod transducin 

muscle-specific protein 
hypothetical protein 


1718 

1395 
405 


96 

100 


285 

286 

287 
288 


AF156102 
Y35897 

AL050143 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


DC1 

ELL complex EAP30 subunit 
Extended human secreted 
protein sequence, SEQ ID NO 
146. 

HEM45 

hypothetical protein 


1859 
1318 
1250 

923 


99 

99 
99 

100 


289 
290 

292 


AJ011098 
Y66724 

AF034801 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


Membrane-bound protein 

PRO835. 

liprin-alpha4 


598 
574 
2321 

2565 


100 
100 
100 

98 


293 
294 


AF034001 "J 
"AL049851 f 


Homo sapiens 
Homo sapiens 


liprin-alpha4 
(isoform 1)) 


2590 
1738 


100 
100 


295 


Y73348 
L11672 


Homo sapiens 
Homo sapiens 


htrm clone 83 96S1 protein 

sequence . 

zinc finger protein 


1245 


99 


296 
297 


AL035423 


Homo sapiens 


OJ20I3.1 (brain mitochondrial 
carrier protein-1 (BMCP1)) 


1694 
1024 


44 

79 


298 


AF198532 
AF161417 


Homo sapiens 
Homo sapiens 


lymphoid enhancer binding 

factor-1 

HSPC299 


2173 


100 


299 

300 


Arli3J.41 1 
inr-iq-i h 


Homo sapiens 


breast cancer metastasis 

suppressor 1 


1147 
1236 


85 
99 


301 


Uxo jy / i 


Rattus 
norvegicus 


inositol polyphosphate 4- 
phosphatase 


160 


30 


302 


AF036145 
Z82022 1 


Homo sapiens 
Homo sapiens 


meningioma - exores cpn — - 
5 -Passed antigen 

GlcNAc-1-P transferase 


3458 


100 


303 
304 


AF269232 (1 


Mus musculus 


butyrophilin-like protein 

BUTR-1 


2067 
271 


99 
50 


305 


&J222644 j 
| 1 


Arabidopsis 
thaliana 


asparaginyl-tRNA synthetase 


559 


iO 


~30T ; 


£P054180 I 
I t 

*J272079 1 I 


Homo 
sapiens 

Homo sapiens 


hematopoietic cell derived 

zinc finger protein 


351 


79 


308 

309 


(T44485 Tl 
UJ131891 fF 


Homo 
sapiens 
Homo sapiens 


APOBEC-1 stimulating protein 

Human GPRW receptor 
polypeptide. 

DNA polymerase mu 


3056 
1721 

*598 """3 


100 

100 

100 


310 


" AF29333S 


Homo sapiens 


p30 DBC 


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12 




93 


312 


X57802 




^ ********* * w 3 4QUUJUU X AmIIL 

chain 


acq 


81 


313 


Z3S715 


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 




- tot 


100 


315 


AF208068 


Homo sapiens 




3046 


100 


316 




Homo 
sapiens 


PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


ciuiuon rvao ^/l ULci.Il t\J\£f . 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF151362 






224 


40 


320 


Y63773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


putative TH1 protein 


3013 


100 


322 


■rvDUfiU □ ± £ 


Homo sapiens 


protein kinase PAK5 


3792 


99 


323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_l, SEQ ID N0:6S. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271. 


197^ - 


100 


325 


VOX QAA 


Homo sapiens 


Human secreted protein clone 
bfl57^l6 protein sequence 
SEQ ID NO: 94. 


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence. 


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


328 


Z78013 


Caenorhabdit 
is elegans 


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo 
sapiens] 
>R65207 
R65207 02- 
MAR-1995 27- 
AUG- 1993 
Human 

stromalin-i . 
{Homo 

can "> on a 


nuclear protein SA-1 


6492 


99 


331 


AL008583 




dJ327J16.3 (supported by 
GENSCAN, FGENES and GENEWISE) 


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ27166£ 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3 


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous to the 
Eimeria tenella gene et100 


154 


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


337 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98 


339 


Z66561 


Caenorhabdit 
is elegans 


Similarity to Human rab13 
protein (PIR Acc. No. 
A49647) . 


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 


439 


84 



SEQ 
ID 
NO: 
387 


ACCESSION 
NUMBER 

AF208845 


SPECIES 
Homo sapiens 


BM-003 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


389 


X57821 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


99 
76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


I6l£ 


62 


395 


AF181721 


Homo sapiens 


R02S 


2254 


100 


396 


Y69197 


Homo sapiens 


human betalv-spectrin 
protein. 




98 


397 


U48238 


Mus musculus 


lAiiys* yjL\J\,KZJ*H ilcUCO*Q4 


749 


60 


398 


AL390137 


Homo sapiens 


hypothetical protein


263 


SI 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces
pombe


nu lepedi. pruCcln 


447 


27 


401 


AC004B59 


Homo sapiens 


similar to 2-oxoglutarate
dehydrogenase; similar to
Oq??i ft / om ♦ «t i c*5 a a \ 


4176 


78 


402 


AB010266 


Mus musculus 


tenascin-X 


10246


62 


403 


AL133288 


Homo sapiens


cub/iu/.i (similar to 
u.rauianoyaster Lbb^oo 
protein) 


761 


100 


404 


Z68753


Caenorhabditis elegans


AL3ii 0 .JO 


888 


48 


405 


Z78013


Caenorhabditis elegans


Similarity to Drosophila 

Parihpri w - r-o "I j *- t mnr 

suppressor 


569 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain -2 


1196 


97 


407 


AF155106


Homo sapiens


NY-REN-36 antigen


1168 


100 


408 


Y57945


Homo sapiens 


HTMPN-69. 


1538


99 


409 
410 


Z18361
AF249744 


Ovis aries 
Homo sapiens 


trichohyalin 
RhoGEF 


184 


30 i 


411 


AF176529 


Mus musculus 


F-box protein FBX13


2733 
2072 


100 
94 


412 


AF210842 


Homo sapiens 


HARP 


4 680 


100 


413 


AL0316S3 


Homo sapiens 


dJ310O13.7 (novel protein 
3) 


776 


98 


414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


2961


99


416 


U43503 


Saccharomyces cerevisiae


Lphlp 


115 


42 


417 


AL160493 


Leishmania 
major 


possible t26fl7.21 


23 9 


T5 


418 
419 


Y08100 
U15131 


Homo sapiens
Homo sapiens 


Human PRO331 protein.
pl26 


330 




420 


AP117946" - 


Homo sapiens 


Link guanine nucleotide
exchange factor II


2228 
2363 


OH 

100 




AF190635 


Drosophila 
melanogaster 


ankyrin 2 


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son -a 


7269 


100


425


AB027249 


Homo sapiens 


MAPKK like protein kinase


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor


1084 


55


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


427 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428 
429 


AE003683 
YQ7B2B 


Drosophila
melanogaster 
Homo sapiens 


CG8312 gene product 
RING finger protein 


149


29


430 
431 


AF09*897 
U41387


Drosophila
melanogaster
Homo sapiens


pushover
Gu protein


2201 
4442 

4021 


99
47

99 


432 
433 


AF023674 
AF146760 


Homo sapiens 

Homo 

sapiens 


nephrocystin 

septin 2-like cell division 
control protein 


3783 
2284 


100 
100 


434 


AB006697 


Arabidopsis 
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like 


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP . 


1704 


100 


438 
439 


AB040672 
AF105228 


Homo sapiens 
Bos taurus 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase

tuftelin 


1075 


63 


440 
441 


R064 63 
X14971 


Homo sapiens 
Mus musculus 


Derived protein of clone
ICA13 (ATCC 40553).
alpha-adaptin (A) (AA 1-977)


285 
3073 


33 ~ 1 
95 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1- 
938) 


4897 
3979 


98 
81 


443 


Y66689 


Homo 
sapiens 


Membrane-bound protein
PRO1136.


3299 


99 


444 

445 


AC067754 
AF229032 


Arabidopsis 
thaliana 
Mus musculus 


unknown protein; 20348-23707 
piL 


114 


33 


446 
447 


AF05603S 
AF132484 


Rattus ~~ 
norvegicus 
Mus musculus 




2077 
2662 


93 
85 


448 
449 


W89024 
AF161445 


Homo sapiens 
Homo sapiens 


Polypeptide fragment encoded
by gene 156.

HSPC327


4 78 
528 


51 
45 


450 


Z68753 


Caenorhabditis elegans


ZC518.3b 


i£o£ 
951 


100 
49 


451 


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3 . 


155 


32 


452 


W85727 


Homo 
sapiens 


Novel protein (Clone 

BM46_10) . 


2799 


99 


453 


Y53^29 


Homo sapiens 


A bone marrow secreted
protein designated BMS115 


2810 


100 


454 
455 


D87438 
AF240468 


Homo ~ 
sapiens 
Homo sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 
nicastrin 


4069 
3687 


100 
100 


456 
457 


Z15005
M59216 


Homo sapiens 

Homo 

sapiens 


CENP-E

receptor beta-1 subunit 


13305 
2477 


99 
100 


458 
459 


Y73467 


Homo sapiens 


Human secreted protein clone 
* _- L t'*- wLem bequence oby 
ID NO:156. 


966 ■ " - 


100 




W67824 


Homo sapiens 


Human secreted protein
encoded by gene 13 clone 
HSLFM29 . 


535 


100 


460 


AF163151 
D874TS 


Homo sapiens 


dentin sialophosphoprotein
precursor


279 


19 


461 




Homo sapiens 

] 


similar to a C. elegans
protein encoded in cosmid
C27F2 (U40419)


9196 


99 


462 
463 

464 j 

465 | 4 


904044 

JWT002398 i 
\F06485"£ I 
\F223408 1 


tfomo sapiens "1 
-iomo sapiens J 

*attU3 sp. 

fomo sapiens ~ 


■iuman secreted protein, SEQ 
ID NO: 8125. 
'25965 1 
7acomp protein 

399 ""i 


486 
L018 

1845 1 
5686 H 


93 

100 

34 

?9~


TABLE 2



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


466
467


AF223408


Homo sapiens


-, ~B99


2878


87


468 
469 


AF104415 
U53450 

AL031297 


Mus musculus 

Rattus
norvegicus

Homo sapiens


gene trap locus-13

Jun dimerization protein 1

JDP-1 

'CLJ97P20.1 (novel gene) 


6336 
196 


91
49


470


AF257077 


Homo sapiens 


eukaryotic translation 
initiation factor EIF2B 
subunit 3 


3S64 " 
1274 


99 
95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein 


284 


38 


472 
473 


Y84903 
AF144237 


Homo sapiens 
Homo sapiens 


A human proliferation and 
apoptosis related protein. 
LOMP protein 


2337 


100 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease
related polypeptide IMX39. 


838 


44 
100 


475
476


Y95006 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO:52.


34 11 


100 


477 


D38549 
AF241230 


Homo sapiens 
Homo sapiens 


hal025 is new
TAK1-binding protein 2


6533
3656


99
100 


478 


AL031534 


Schizosaccharomyces
pombe


putative asparagine synthase


482 


40 


479 
480 


L28125 
AF161544 


Podospora 
anserina 
Homo sapiens 


beta transducin-like protein 
HSPC059 : 


233 
434 


26 
77 


481 
482 

483 
4 84 


AJ238248 
Z38061 

AF161381 
Ac 


Homo sapiens 
Saccharomyces cerevisiae

Homo sapiens 
Homo sapiens 


centaurin beta2 

mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)

HSPC263 

AD021 protein 


3986 
295 

1404 
1314 


99 
23 

100 
100 


486 
487 
488 


X57527 
Y19062 
Y73373 


Homo sapiens 
Homo sapiens 
Homo sapiens 


alpha 1(VIII) collagen
39k3 protein

HTRM clone 921803 protein
sequence.


4166 
2475 
555 


99 

100 

56 


489 
490 


AL021918 


Homo " — 
sapiens 


b34I8.i (Kruppel related Zinc 
Finger protein 184) 


4184 


100 


491 


X53773 
U52426 


Rattus 
norvegicus 
Homo sapiens 


938) 
~G0K 


4675 


97 


492 
493 


AL359773 
AF226^14 


Leishmania 
major 

Homo sapiens 


possible threonine synthase 
ferroportin1


1459 
702 


S9 
45 


"494 
495 


Z93241 
AF036977 


Homo sapiens 
Homo sapiens 


CuJ2j£2E13.X (novel n rn hpi n 

with some similarity to 
Drosophila kkaken) 
unknown ~ ' 


2929 
513 

1812 


100 
96 

100 


496 
497 


U93564 
Y91405 


Homo sapiens 
Homo sapiens 


p40 

Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126 . 


357 


45 
100 


498 
499 


AF069781 


Drosophila 
melanogaster


aem46-like protein


653 


43 


500 


Y16601 


Homo sapiens 


Human cell -cycle 
phosphoprotein CECYP-2. 


1658 


93 , 




X70944 


Homo sapiens 


PTB-associated splicing 

factor 


3883 


100 


501 

502 
503 
504 


AF02 7"^03

AF282874
AJ249732
AF208861


Mus

musculus
Homo sapiens
Homo sapiens
Homo sapiens


putative membrane-associated 

guanylate kinase 1 

nectin 3; PRK3 

38 protein 

BM-019


205 

2856 
S69 


36 

99 | 
100 


505 - 

507 j 

508 I 


b09708 1 
<662 85 t 
300189 3 
i 


Homo sapiens
Mus musculus
Rattus
norvegicus


complement component C2 
iCl ORF 

*a+,K+-ATPase alpha-subunit ; 


1629 
4022 
LIS 
5227 


100 
100 
13 
39


TABLE 2



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


509
510 
511 
512 


Y94971 

AB019038
AB019038
AB019038 


Homo sapiens

Homo sapiens
Homo sapiens
Homo sapiens


Human secreted protein clone
fal71_l protein sequence SEQ
ID NO: 148.

beta-1,4 mannosyl transferase
beta-1,4 mannosyl transferase
beta-1,4 mannosyl transferase


2176 

781 
1347 


100 

77 
100 


513 
514 
515 

516 


X84908 
X52851
AF186084


Homo sapiens
Homo
sapiens


phosphorylase kinase
peptidyl prolyl isomerase
epidermal growth factor 
repeat containing protein 


1520 
5729 
650 
3046 


99 
99 
76 
99 


517


G03602 
U04706 


Homo sapiens
Bos taurus


Human secreted protein, SEQ
ID NO: 7683.
50 kDa protein


505 


99 


518 
519 


G00653 
AF161475 


Homo sapiens 
Homo sapiens


Human secreted protein, SEQ 

ID NO: 4734 . 
HSPC126 


1749 
530 


77 

~ 1 ftf ; 


520 
521 


Y99364 
AF2S68S2 


Homo sapiens 
Homo sapiens


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88.
PTPIiA " — 


1368 
3394 


100 
97 


522 


AE000995 


Archaeoglobus fulgidus


chromosome segregation
protein (smcl) 


1295 


100 
20 


523 
524 


AF062249 


Homo sapiens 


immunoglobulin heavy chain 
variable region 




57 


525 


AJ223830 


Rattus
norvegicus


ARE1


2950 


98 


526 


W01535 


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


1276 


83 


527 


AF145658


Drosophila
melanogaster


BcDNA.GH10229


320 


33 


528


AF112213
D49387


Homo sapiens 


protein 


-524 


79 


529 




Homo 
sapiens 


NADP dependent leukotriene b4
12-hydroxydehydrogenase


i£i<; 


100 


530 


Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9 . 


328 


32 


531 


AL079335 


Homo sapiens 


dJi32F21.3 (72T1 KDa protein — 
(DKFZP564A03 2 , SBBI88) 
similar to mouse IFN-gamma 
induce MG11. J 


^1059 


99 


532 


Y91506 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179. 


1159 


98 


533 


X76116 


Caenorhabditis elegans


carrier protein (c2)


576 


50 


534 


X76116


Caenorhabditis elegans


carrier protein (c2)


506 


50 


535 


X12966


Homo sapiens


3-oxoacyl-CoA thiolase
propeptide (424 AA) 


1972 


100 


536 
537 
538 


Y09267 

Z11773
D84224
D84224


Homo sapiens

Homo sapiens
Homo sapiens
Homo sapiens


flavin-containing
monooxygenase 2
SRE-ZBP

methionyl tRNA synthetase
methionyl tRNA synthetase


2486 

2201 
4741 


100 

99 
99 


539 
540 
541 

542 


D84224 

D84224
J03244


Homo sapiens
Homo sapiens
Bos taurus

Homo sapiens


methionyl tRNA synthetase
methionyl tRNA synthetase

H+ ATPase 31kDa subunit (EC 

3.6.1.3) 
Human OXRE-11. 


3887 
2933 

4529 * - 
848 


99 
96 
99 
77 


543 


AF221712 

| 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


2301 
2151 


yy 

51 


545 J 


^£000919 I 
: 
t 

j c 


Methanobacterium
thermoautotrophicum


conserved protein


207 ; 


38 




^06669 '1 
1 c 


synthetic
construct


preTGF-beta1

L 


!070 J c


>9


TABLE 2



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


546
547 


Y02698 
AF112205 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS60.
WSB-1 protein


854

2275 


93 
100 


548 
549 

550 


X60271 
AC016827 


Mus musculus

Arabidopsis 

thaliana 


c-rel
putative GTPase


2264
810


74
42


551 


Y70400 
AB048365


Homo
sapiens
Homo sapiens


Human cell-signalling
protein-2.

NEDD4-like ubiquitin ligase 1


429 


68 


552 
553 


Y57880 
AF1198SS 


Homo sapiens 
Homo sapiens 


Human transmembrane protein
HTMPN-4.
PRO1847


8290 
1112 

265 


99 
95 

67 


554 
555 


M17236 
AL078468 

""AC006S63 ■ 


Homo sapiens 
Arabidopsis 
thaliana


MHC HLA-DQ alpha precursor 
putative protein 


1332 
540 


100 
40 


556 

557 
558 


AK024487 


Homo sapiens 
Homo sapiens 


similar to Kelch proteins;
similar to BAA77027
(PID:g4650844)
riAjooose protein 


515 
1623 


44 
98 


559 
560 


M12140 
W74825 

X56581 


Homo sapiens 
Homo sapiens 

Homo sapiens 


pol gene protein; Xxx 
Human secreted protein 
encoded by gene 97 clone 
HAQEF73 . 
junD protein


117

225


48
56 


561 
562 


AF003136 


Caenorhabditis elegans


contains weak similarity to 
an AMP-binding motif 


373 
2926 


88 
54 




.AL.139839 


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1 
(poly (A) -binding protein) 


877 


100 


563 


AF181640 


Drosophila 
melanogaster 


BCDNA.GH09817 


289 


42 


564 

565 
566
567
569 


AF052723 

AF161472 
Y28817 
U09848 
AF155113 


Feline 

leukemia 

virus 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


gag-pol precursor polyprotein 
gPr80 

HSPC123 

pt3a6_4 secreted protein, 
zinc finger protein 
NY-REN-55 antigen


1547 

439 
3338 
1738
3603 


43 

44 

100 

100 

93 5 


570 
571 
572 
573 
574 

575 


AF155113 
AL032821 
M69181 

Y59678 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


NY-REN-55 antigen

dJ55C23.1 (vanin 1)

non-muscle myosin B

non-muscle myosin B

Secreted protein 108-008-5-0- 

E6-?l. 


1821 
7350 
7311 
772 


99 
98 
99 
98 
100 


576 


AL365234


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 


AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


578 


X06745
AB041642


Homo sapiens
Homo sapiens


DNA polymerase alpha-subunit
(AA 1-1462)

PAR-6


7619 


99 


579
580


D86984


Homo sapiens


similar to yeast adenylate 
cyclase (SS6776) 


1342 
2446 


100 
100 


581 


AF165124 ~ 


Homo sapiens 


gamma -aminobutyric acid A 
receptor gamma 2 


2499 


33 


582 


W88812 
U82319 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58. 
novel ORF 


2339 


99 


"583 

"584 : 


P92219 

ftJ22394 8 ] 


Homo sapiens 
(human) j 
Homo sapiens 


CR1 protein.
RNA helicase


342 
11425 


100 
99 


585 


Y08612


Homo sapiens


88kDa nuclear pore complex
protein


5608 
3874 


39 
39 


587 j 


1T42384 i 

£ 

^F129756 I 


iomo i 
sapiens i 
iomo sapiens J 


Maino acid sequence of 

Lv3io 7. 

1AT4 


1007 : 
1873 < 


17 
)8


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


588
589


AF131775


Homo sapiens


Unknown


1929


99


591 

592 
593 
594


AJ250865 
298885 

L76571 
AF091622 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


TESS 2 1 

dJ522u/.2 (bromodomain-
containing 1 (similar to
peregrin, BR140))
nuclear hormone receptor
PHD finger protein 3


2348 
4167 

1355 
9054 


100 
100 

100 | 

100 | 


595
596

597


X56807 

AL137802

AL022329 

AF226048 


Homo sapiens 
Homo sapiens" 
Homo 
sapiens 
Homo sapiens 


desmocollin type 2a
CU798A10.1 (novel protein)
WC407K11.2 (adrenergic, beta,
receptor kinase 2)
GL003


4443 
~ 212 " 
3653 


100 ( 
S5 j 
100 


598 
599 


AJ278112 


sapiens] 
>Y49635 
Y49635 21- 
OCT-1999 15- 
APR-1998 
Human sdp3.S 
protein. 
[Homo 
sapiens 


putative cell cycle control
protein 


2009 
33S 


99 ) 
23 


600 


YS9741 
L36531 


Homo sapiens 
Homo sapiens 


Human normal ovarian tissue
derived protein 10.

integrin alpha 8 subunit


1574 


99 


601 
602 


Y38458 
AF218584 


Homo sapiens
Homo sapiens


Human secreted protein
encoded by gene No. 20.
GGA1


5386 
895 


99 j 

100 | 


603 
604 


Y13115 

AL132776 


Homo sapiens 
Homo sapiens 


kinase 

dJ393D12.1 (KIAA0776) 


3265 
5071 


100 j 
99 ' ] 


605

606 
607 


AL034452 
Y14494 


Homo sapiens - 
Homo sapiens 


w u ^ u i j . a inovei k_oj.xagen 
triple helix repeat 
containing protein) 
aralar1


2413 
1979 

3465 


99 — j 
100 

99 J 


608 
610 


AJ001981 
X86098 


Homo sapiens 
Homo 

sapiens | 


OXA1L " 

binds directly to adenovirus 
type 5 E1A protein 


2603 
3049 


100 J 
100 


611 
612 


AF163572 
AF161503 


Homo sapiens 
Homo sapiens | 


Forssman glycolipid
synthetase

HSPC154 


1865 
1261 


99 

97 ! 


613 

614 
615 
616 


L41834 
Y91954

AL022327 

X8S786 

Y08319 


Ensis minor
Homo sapiens

Homo sapiens
Homo sapiens
Homo sapiens


nuclear protein

Human cytoskeleton associated
protein 9 (CYSKP-9).
dJ355C18.1 (KIAA0027)
binding regulatory factor
kinesin-2


34$ 

lit {to 

Jol 
3203 


30 j 
100 

94 j 

100 j 


617 
~619 

620 


D12644 
U28789 
Y35914 


Mus musculus
Mus musculus
Homo sapiens


KIF2 protein

PACT

Extended human secreted 
protein sequence, SEQ ID NO 
163. 


3487

5936
1684


99
97

89
99


621 


AS046382 


Mus musculus 


testis-abundant finger
protein 


199 


23 


"622 
623 


Y00062 
AF068286 


Homo sapiens r 
Homo sapiens 


precursor polypeptide (AA -23 

to 1120) 

HDCMD38P 


3440 
861 


99 

100 J 


624 
625 


X98248 
X61100


Homo sapiens
Homo sapiens



sortilin

75 kDa subunit NADH
dehydrogenase precursor 


4436 
3734 


99 | 
99 


626 ' - 
627 


S58544 
AF151027


Homo sapiens
Homo sapiens


75 kDa infertility-related

sperm protein

HSPC193


2125 
582 


99 

33 j 


628 ■ < 


£1496^8 "1 
K0911 ] 


•iomo sapiens | I 
■lomo sapiens (1 


Hi -alpha subunit (AA 1-404) ; 
ftiman fetal brain cDNA clone 
g?7_l derived protein 


2079 

1983


100
100


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


629


Y50911 


Homo sapiens 


Human fetal brain cDNA clone
vb7_l derived protein 




100 


630 


AF098786 


Homo 
sapiens 


17 beta-hydroxysteroid
dehydrogenase type VII 


1754 


100 


631 


AL034555 


Homo 
sapiens 


dJ134019.3 (zinc finger 
protein 151 (pHZ-67)) 


4273 


100 


632 
633 


W74826 
AF288288 


Homo sapiens 
Homo sapiens


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 
HPT protein 


794 
j 223£ 


96 
. 100 


634 
635 

636 
637 
638 


AF041429 
X663S7 

Y11284 

AB004884

AJ002303


Homo sapiens 
Homo sapiens 

" Homo sapiens 
Homo sapiens 


pRGRl 

serine/threonine protein 

kinase 

AFX1 

PKU-alpha " 

synaptogyrin 1c 


823 
1589 

j 2571 
3718 


99 
100 

98 
99 


639 
640 
"641 

642 
643 


AJ002304 
AJ002303 
D87682 

M14660 
X06661 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


synaptogyrin 1c
similar to a C.elegans 
protein encoded in cosmid 
T26A5. 
ISG-K54 

calbindin (AA 1-261) 


1020 
1002
933

267*

2473
1358


100
100
94
100

99 


644 
645 


AF119900 
AB031048


Homo sapiens
Drosophila
melanogaster


PRO2822

microtubule associated- 
protein orbit 


1 185 

738 


100 

76 

27 


646 
647 


AF250842 
X86691 


Drosophila
melanogaster 
Homo sapiens 


multiple asters 


334 


29 


64 8 "' 


U67934 


Homo sapiens 


ni"* protein 

44.9 kDa protein C18B11
homolog


10110
827 


99 
96 


649 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein 


3330 


91 


650 


AL034553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein
similar to Mus musculus
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit | 


2388 


99 


654 
655 


AC004614 


Homo sapiens 


AB006086 (PIDtg252922S) j 


J Uzb 


99 


656


Y57908 
Z34975 


Homo sapiens 
Homo sapiens 


Human transmembrane protein
HTMPN-32.
ldlCp


608 
3733 


99 
100 


658 
659 

660 
661 


ALiQ5D10P» 

nu« J u j u u 

W76734 

AF202724 
Z21966 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 
Homo sapiens " 


dJ475B7.2 (novel protein)
Human mDia Rho targeting
protein.
Sad1 unc-84 domain protein 1
mPOU homeobox protein


1942 
781 

2172 
1529 


99 
34 

100 
100 


662 
663 

667 


AJ242954 

r^J! X O Z J ± O 

AL161516 
X59303


Mus musculus

Homo sapiens

Arabidopsis

thaliana

Homo sapiens


dysferlin
myoferlin

hypothetical protein
valyl-tRNA synthetase


4752 
6232 
209 


i9 
99 
JO 


668 


Y133S5 


Homo sapiens 


Amino acid sequence of
protein PRO220.


3393 

3*92 ■ " 


99 

100 


671 


AB010692
X56123


Arabidopsis
thaliana

Mus musculus


contains similarity to endo-
beta-N-acetylglucosaminidase
gene
talin


611 


52 


672 " 

673 i 

674 k 
"675 1 


AB039371 

\F269223 J 
\F229633 " I 
1*14463 I 


Homo sapiens 

iomo sapiens ' 
nus musculus < 
Cactus T" 


mtochondrfai ABC transporter 
3 j 

rcpn f-j 

jroucho- related protein 4 * 
r transducin p 


4474 j 
2902 

306 
1053 

J619 < 


76 
99 

12 
99 
J2


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY






norvegicus 








676 


AC005757 


Homo sapiens 


R32611_l 


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
homolog=pol {retroviral
element} 


252 


65 


678 


Ar 1 i 1 Job 


Homo sapiens 


CMP-N-acetylneuraminic acid 
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERF-1


1783 


100 


680 


AF118566


Mus musculus


hematopoietic zinc finger 
protein 


769 


50 


681 


Y51415 


Homo 
sapiens


Human wild type pXe83 
protein. 


2621 


99 


682 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 


700 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
clone hfb341 protein 
sequence. 


5888 


99 


684 


Y94952


Homo sapiens 


Human secreted protein clone , 
fhll6_ll protein sequence 
SEQ ID NO: 110. 


354 


98 


685 


AL021878 


Homo sapiens 


dJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2) ) 


154 


67 


686 


AE000198 


Escherichia 
coli 


orf, hypothetical protein


628 


100 


687 


M58378 


Homo sapiens 


synapsin I 


3730 


99 


688 


AF039697 


Homo sapiens 


antigen NY-CO-31


508 


98 


689 


U09355 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 B 
gamma subunit 


2356 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen


265 


50 


691 


AC004774 


Homo sapiens 


Dlx-5 


1542 


100 


692 


X90530 


Homo sapiens 


ragB 


1926 


99 


693 


X90530 


Homo sapiens 


ragB 


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644. 


330 


100 


696 


AC011810 


Arabidopsis 
thaliana


Putative methionine
aminopeptidase


669 


52 


697 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699 


Y994Q1 


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO: 218.


1386 


100 | 


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchm1.


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchm1.


1736 


99 


706


Y41257 


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 


AL022237 


Homo sapiens 


bK119lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 


2030 


100 


708 


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


777 


99 


710 


Y08698 


Homo sapiens 


raabp3 


2849 


93 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-2. 


754 


99


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY




712 


U93574 


Homo sapiens 


putative p150


799 


59 




713 


AC004531 


Homo sapiens 


box helicases 


£. / AS 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 




715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


""98 




716 


AL137013 


Homo sapiens 


bA3HP8.3 (probable uracil
phosphoribosyltransferase)


862 


100 




717 


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase 


1696 


93 




718 


' Y96290 


OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


numan lOrAM-2 immunoglobulin. 


2345 


85 




719 


X07979 


Homo sapiens 


integrin beta 1 subunit
precursor 


4347 


99 




720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


93 




721 


Y07595 


Homo sapiens 


transcription factor TFIIK 


2373 


100 




722 


W41565 


Homo 

sapiens] 

>W41564 

N41564 08- 

OCT- 19 97 05- 

APR-1996 

Human 

calpain. 

[Homo 

sapiens 


Human calpain. 


1591 


99 




"723 


ri-T 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


1143 


"46 


726 


AC006708


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876) 


988 


46 


727 


AC024818 


Caenorhabditis elegans


contains similarity to Pfam 
family PF00400 (WD domain, 

ucsu<* repeat / , score^o i . a , 
E~1.4e-20, N«3 


9S0 


44 


728 


AJ005897 


Homo sapiens 


JM5 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 

V JfacJITlPTI h Pnrnr^Pi^ F'mm /-ranca 

^^oyiiciiii ciiluucu om gene 
27. 


908 - 


97 


730


G03931




Human secreted protein, SEQ
ID NO: 8012.


578 


100 


731 


AB012720


Oncorhynchus
masou


GTP-binding protein 


3865 


76 


732 


W73404 ] 


Homo sapiens 


Human secreted protein 
n c.(~ir^p<^ Hvf Ron o yjf\ a 


862 


97 


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


"97 


734 


AC024813


Caenorhabditis elegans


Hypothetical protein 
Y54F10AL.a 


152 


24 


735 " 


AL035461


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol' 
phosphatidyl transferase 
family member protein) 


1562 


98 


736 


U00033


Caenorhabditis elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098


Homo
sapiens


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 






WO 01/53312 



PCT/US00/34263 



TABLE 2 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793 




100 


739


AJ133115


Homo sapiens


TSC-22-like protein


2054 


99 


740 


X98258


Homo sapiens 




9S3 


100 


741 


X98258 


Homo sapiens


M-phase phosphoprotein 9 


564 


74 


742 


U97191 


Caenorhabditis elegans


sub-family of RAS proteins


jo U 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase 


2191 


100


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7290.




98 


745 


X97064 


Homo sapiens 


Sec23 protein 




99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388


Homo sapiens 


HTRM clone 3376404 protein 
sequence. 


1565 


99 


748


M19529


Sus scrofa


follistatin A


1906 


98 


749 


AJ249457 


Trichomonas 
vaginalis 




183 


28 


750 


AC004410






2094 


100 


751 


AF074968 


Homo sapiens 




2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 


4005 


100 


753 


AB049629 


Homo sapiens


phospholysine
phosphohistidine inorganic
pyrophosphate phosphatase


1375 


99 


754 


D79205 






160 


77 


755 


AB008430 






142 


29 


758 


L32162 


Homo sapiens 


transcription factor 


574 


80 i 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


sapiens 


Human cell signalling 
protein- 13 . 


625 


100 


761 


AF218586 




Llue-D 


1136 


100 


762 


U38934 


Gallus 


histone H2A


625 


97 


763 


AF226053 


Homo sapiens




606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1 - 743) 


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid
C27F2 (U40419)


568 


38 


766 


AL023828 


Caenorhabditis elegans


i 4. /Vj 1 a • ±H 


200 


27 


767


Y82777 


Homo sapiens 


Human chordin related protein 


2551 


99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769 


Y42752 


Homo sapiens


Human calcium binding protein
3 (CaBP-3)


1426 


100 


770 


X51416 


Homo sapiens 


hormone recpotor hiPRPi f aa i 
S21) 




97 


771 


AJ006591 


Homo sapiens 


cysteine-rich protein


1793 


100 


772 


A08695 


Homo sapiens 


rap2 




100 


773 


Z12173 




N-acetylglucosamine- 6 - 
sulphatase 


"2970 


100 


774 


Y91950 


Homo sapiens 


Human cytoskeleton associated 
protein 5 (CYSKP-5) . 


DOS 


4 J 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


OjD 




777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


8*5 


56 


778 


G01880 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5961.


84 9 


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase 


4155 


99 


780 


AL078S82 


Homo sapiens 


dJ130E4.2 (KIAA0796) 


1321 


68 


781 


Z75955 


Caenorhabditis elegans


Similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJH21G12.2 (SCAN domain-
containing 1 protein) 


900 


100


783 


AF061262 


Mus 

musculus 


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95


TABLE 2



PCT/US00/34263



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


785 


Y84441 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein


2074 


100 


786


Y00918


Homo sapiens 


Human Rab protein, RABP-1,
protein sequence.


1048 


" 99 


787 


Z97029


Homo sapiens 




1548 


S9 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


_ 962 


S4 


789 


AF024631 






2644 


100 


790 


AJ006710


Rattus
norvegicus


phosphatidylinositol 3-kinase


4508 


97


792 


V00638 


bacteriophage


reading frame ea10


600 


100 


793 


AF049103 




Huntingtin interacting 
protein 


819 


100 


795 


Z26317


Homo sapiens


desmoglein 2


4810


99


796 


Y76884 


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus

gallus 


trypsinogen 


372 


37 


798 


U97189 


Caenorhabditis
elegans


strong similarity to the
PI3/PI4 family of kinases


227 


28 


799 




Homo sapiens


neuronal protein NP25 


1053 


100 


800 


AF2347S5 


Rattus 
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP86 


958 


63 


801 




Homo sapiens 


placental protein 13-like 
protein 


743 


99 


802 


AT 4UOt>3i 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097 


Caenorhabditis
elegans


Similarity to Human
retinoblastoma-binding
protein RBAP46 yk662d12.5
comes from this gene


152 


27 


804 




Homo sapiens


Human secreted protein, SEQ 
ID NO: 6194. 


496


98


805 


AL121673 




bA305P22.1 (novel protein) 


1160 


100


806 


AC013483 


Arabidopsis
thaliana


putative GTPase activator 
protein 


264 


30


807 


AC013483 


Arabidopsis
thaliana 


putative GTPase activator 
protein 


264 


30


808


AB013885


Homo sapiens


beta-ureidopropionase 


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


810 




Homo sapiens 


HSPC303


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


Z74029 


Caenorhabditis
elegans


similarity to C. elegans
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 


Z73497 


Homo sapiens


CU240U2.2 (Core histone 


324 


100 


814 


W87689 


Homo sapiens


nunictn tiiAr iiy polypeptide* 


1484 


99 


815 


X16282 


Homo sapiens


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium
tuberculosis




300


36








818 


AB030483 


Mus musculus 


B9 


197 


27 


819 


AL117555 


Homo sapiens


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865 


97 


821


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822 


L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531 


Schizosaccharomyces
pombe


caffeine-induced death
protein 1


184


29


WO 01/53312


825


AJ006692


Homo sapiens


ultra high sulfer keratin






826
827


U23037
~"ooT5T5 


Oryctolagus
cuniculus


eIF-2Bepsilon 


O ?J 

3406 


63 
90 


828




Homo sapiens 


human secreted protein, SEQ 
ID NO: 7493.


464 


100 




Y30327 


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829 


Y32199


Homo sapiens


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012 


100 


830 
832 


W78279 
AB011542


Homo sapiens
Homo sapiens


Fragment of human secreted 
protein encoded by gene 33.
MEGF9 


1264 


99


833


G02639


Homo sapiens


Human secreted protein, SEQ 
ID NO: 6720. 


2097 
223 


100 
70 


834 


AF119664
AF119664


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100 


835 
836 




Homo sapiens


transcriptional regulator 
protein HCNGP 


1144


89 


837


AF119664 
X12517 


Homo sapiens
Homo sapiens 


transcriptional regulator
protein HCNGP


1448


94 


838 

839 
840 


U32865 

AF067730 
U27831


Drosophila 
melanogaster 
Homo sapiens
Homo sapiens 


linotte protein
TLS-associated protein TASR-2 


918 
164 

631 


100 
24 

56 


841 
842 

843 


AF286366 
G02309 


Homo sapiens 
Homo sapiens 


striatum-enriched phosphatase
CamKI-like protein kinase 
Human secreted protein, SEQ 
ID NO: 6390. 


2840 
1796 
278 


98

100 

98 


844 


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 




G01350 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5431. 


529 


100


845 

847 
848 
849 


U27836

Y87788 
AF164794 


Mus musculus

Homo sapiens
Homo sapiens


glycosyl-phosphatidyl-
inositol-anchored protein
homolog

Human RBP-26 protein.

Diff33 protein homolog


3305 

2026 
2398 


"9* 

100 
100 


850 
851 

852 


U41315 
AF192784
Y58628 

Z22968 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 


ZNF127-Xp
makorin 1

Protein regulating gene
expression PRG3-21.
M130 antigen


2458 
2062 
1548 


93 
97 
100 


853 
854


Z22971 


Homo sapiens 


variant 


6205 
6380 


100 

160 




G03362


Homo sapiens


Human secreted protein, SEQ
ID NO: 7443.


330 


96 


855 
856 


G03362 
AF285118 


Homo sapiens
Homo sapiens 


Human secreted protein, SEQ 

ID NO: 7443. 

CGI-203


203 


100 


857 
858 




Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


452 
1383 


100 

55 




AL021546 


Homo sapiens


cytochrome c Oxidase
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593 


100 


859 

860 
861 


L02956 

AF201947
ti31783 - I 


Xenopus 
laevis 

Homo sapiens
Mus musculus


ribonucleoprotein

>JEK binding partner 1 
uridine kinase 


1664 
516 


85 
100 


862
863

864


AF161472
Z49068

AF154108


Homo sapiens
Caenorhabditis
elegans
Homo sapiens


HSPC123
mitochondrial carrier protein

tumor necrosis factor type 1
receptor associated protein


1266

502

170

1559


92
73
43

99


865
866


AE001530


Helicobacter 
pylori J99 


putative 


230 


32


867


X57807




immunoglobulin lambda light 
chain 


699


91 


868


AL031673
Y11652


Homo sapiens 


dJ694B14.1 (PUTATIVE novel

KRAB box protein with 18 C2H2 
type Zinc finger domains) 


4066 


99 


869


AF192S68 

AB020648 
AL031427 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


phosphate cyclase 


238


100


870
871


high-glucose-regulated
protein 8
KIAA0841 protein
dJ167A19.1 (novel protein)


3041 
3237


99
S9 


872
873

874


AF151534 
AL021331 

X14608 


Homo sapiens 
Homo sapiens 

Homo sapiens 


core histone macroH2A2.2 
djj&6N23.1 (putative C. 
elegans UNC-93 (protein 1, 
C46F11.1) LIKE protein) 
propionyl-CoA carboxylase


1608 
1866 
1129 


100 
100 
100 


875
876


AL117334 


Homo sapiens 


QJ687F11.1 (novel protein 

(part of translation of cDNA 
DKFZp434N061, Em:AL110249))


3579 
306 


100 
100 


877


X79489 


Saccharomyces
cerevisiae


E-925 protein 


446 


35 


878


YS3 001 
AF2310g4 


Homo sapiens 
Homo sapiens 


Human secreted protein clone
dn834_i protein sequence SEQ 
ID NO: 8. 
CKMP1.5 


811 
957 


100 
100 


879
880


X79417
AF001317


Sus scrofa
Saccharomyces
cerevisiae


40S ribosomal protein S12
Soi1p


687 
478 


100
28


881
882


Y87275 
M14036 


Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO: 52.

C1-inhibitor


2547 


100 


883
884


AB041261
AF020313


Homo sapiens
Mus musculus


calcium-independent

phospholipase A2

proline-rich protein 48


598 
2903 

999 


77 
100 

84 


885 
886

887 


Y10936 
AF073997 


Homo sapiens 
Mus musculus


hypothetical protein
myotubularin related protein
1 


1104 
866 


36 


888


Y57893
AL117635


Homo sapiens 
Homo sapiens 


Human transmembrane protein 

HTMPN-17. 

hypothetical protein 


1099 


94 


889


AF210317


Homo sapiens


facilitative glucose
transporter family member
GLUT9


929 
2046 


99 
99 


890 
891


Y36031 


Homo sapiens 


Extended human secreted

protein sequence, SEQ ID NO 
416. 


583 


160 




Y36031 


Homo sapiens


Extended human secreted
protein sequence, SEQ ID NO 
416. 


192 


57 


892 
893 


AF237631 
AF090929 


Homo sapiens 


ubiquitous tropomodulin U-
Tmod

PRO0477p


1798


100 


894 
895


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)


653 
3196 


99 
100 


896


AL031228
AF171102


Homo sapiens

Homo sapiens


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1)


2825 


96 


897


AE003551


Drosophila
melanogaster


retinal degeneration B beta
CG18176 gene product


1302

533


§5 
S3


898 


AJ237946 


Homo sapiens 


DEAD Box Protein 5 


2443 


100


899 


Z97184 


Homo sapiens 


HKE2 


624 


100 


900 


Z97164 


Homo sapiens 


HKE2




Qg 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 




1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth
inhibiting factor. 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila
melanogaster 


tuiu jot y ciic UZuUULpL 


446 


33 


906 


W55542


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


98 


907 


W55542 


Homo sapiens 


guanylate binding protein
isoform I


2901 


96 


908 


W84085 




nuitidii iHSiiiDi ariQ iusiori protein 
WDProl . 


1889 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952.


321 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243. 


387 


87 


913 


AJ243721 


Homo 
sapiens] 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998 
Human OXRE- 
5 . [Homo 
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


1710 


100 


914 


U24189 


Caenorhabditis
elegans


hypothetical protein 1207-1;
Method: conceptual
translation supplied by
authors


244 


41 


915 


Y02591 


Homo sapiens


A human progesterone receptor 
complex p23-like protein. 


843 


99 


915 


AE000984 


Archaeoglobu 


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


171 


26 


918


M23159 


cricetus 


DHFR-coamplified protein


163 


30 


919 


L12018


Caenorhabditis
elegans


putative 


1232 


41 


920 


AF102177 




tuinuf anuxgen oiif— op 


1260 


97 


"921 


AL096712 




uu rttiz^ . £. \siuiiiar co a 
novel human gene mapping to 
Activator ) 


1017 


78 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


86* 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabditis
elegans


similar to 


605 


51


925 


X71978 


Mus musculus 


Fi£ ■ ■' 


1503 


95 


926 


M92288


Drosophila 
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_i. 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate-
epimerase


912 


100 


931 


U28991 


Caenorhabditis
elegans


coded for by C. elegans cDNA
Cm21c7


660


55


932 
933 

934 


G01384 


Homo sapiens 
Homo sapiens 


hypothetical protein 

Human secreted protein, SEQ 

ID NO: 5965. 


210 
767 


25 
98 




AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200


100


935 

936 
937 


AL035681 

AB026808 
AB015345 


Homo sapiens 

Mus musculus
Homo sapiens 


dJ756G23.3 (novel protein
similar to drosophila 
transcriptional repressor) 
synaptotagmin XI 
HRIHFB2216 


1142 
2142 


BU 

95 


938 
939 


X65724 
W89024 


Homo sapiens 
Homo sapiens 


ORF2

Polypeptide fragment encoded 
by gene 156. 


2601 

498 

1487 


99 

100 

100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128. 


117 


100 


941 


AF094583 


Homo sapiens 


putative HIV-1 infection
related protein 


452


100


942
943 


AC024200 
AF129756 


Caenorhabditis
elegans

Homo sapiens 


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 

"GTS — 


350 


69 


944 


K23765 


Rattus 
norvegicus 




273 


100 
96 


945 

946
947


AC009917 

AF223458 
AF05S473 


Arabidopsis 
thaliana 
Homo sapiens 
Homo sapiens 


Contains similarity to 
AD021 protein 

CiAGK-8 


"Trt 

EC, ' 

b bl 
273 


47 

44 
51 


948
949
950


X75756

AF143956 

Y36729 


Homo sapiens 
Mus musculus
Homo
sapiens


protein kinase C mu
coronin-2

Human PG1 protein sequence. 


2019 
2300 
1861 


68 
93 
99 


951 

OCT 


W49041 




Human low density lipoprotein
binding protein LBP-2. 


282 


67 




AB016881 


Arabidopsis 
thaliana 


gene_id:MXC17.7


203 


46 


953

954


Y017B5 


Homo sapiens 


HUBlSn tlbi mi I f i n . Pnnnnns ^ i nn 

enzyme >Y25341 Y25341 01- JUL - 
1999 12-AUG-1998 Human NCB-2 
protein. 


365 


100 




AF145615


Drosophila
melanogaster


BcDNA.GH03377


823 


46 


955 
956 


U09410 
U09410 


Homo sapiens 
Homo sapiens 


zinc finger protein ZNF131
zinc finger protein ZNF131 


2483 


99


957 


flg-i qcgo-a 

/VP I^JDZ J 


Homo sapiens 


cholinephosphotransferase 1 
alpha 


1853 
2126 


99 
99 


958 


X94917 


Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb


155 


32 


959 
960 


U54807 
AF058807 


Rattus 
norvegicus
Bos taurus


GTP-binding protein
GTP-binding protein rah


1167 


97 


961 
962 


G03244 
AF078850 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325. 


606 

471


97 
100 


963 


AP001754 


Homo sapiens 


transient receptor potential-
related channel 7, a novel
putative Ca2+ channel protein


583 
317 


40
30 


964 


AL03S419 


Homo sapiens 


dail00H13.1 (putative novel 
protein) 


1129 


100


965 


X61381


Rattus
rattus


interferon -induced protein 


202 


46 


966 


D38169 


Homo 
sapiens 


inositol 1,4,5-triphosphate
3-kinase isoenzyme 


3278 


100 


967


AL031432


Homo
sapiens


dJ465N24.2.1 (PUTATIVE novel
protein) (isoform 1)


393 


100


968


U79275


Homo sapiens


969 


AJ011306


Homo 
sapiens 


guanine nucleotide exchange
factor (long isoform)


611 
2752 


100 
99 


970 


AF281134 


Homo sapiens 


exosome component Rrp46


X lob 


100 


971 


US3336" 


Caenorhabditis
elegans


weak similarity over a short
region to myosin heavy chain


536 


23 


972 


AC018749 


Leishmania 
major 


L8840 . 12 


589 


53 


973 


AP183504 


Mus musculus


LNV 


544 


85 


974 


025801 


Homo sapiens 


Tax1 binding protein




98 


975 


AF049523 


Homo sapiens


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


976 


AF161530


Homo sapiens 


HSPC182 


"1040 


100 


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626 


100 


978 


AF164797 




ribosomal protein L17 isolog 


908 


100 


979 


U94991 


Xenopus 
laevis


transcription factor XLM01 


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrina 


2029 


100 


981 
982 


Y94888
AJ243191


Homo sapiens
Homo sapiens 


Human protein clone HP01462.
heat shock protein 


2501 


100 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


827 
964 


96 
85


984 


AJ249207 


Rhodococcus 
ana it 


putative racemase 


351 


43 


985 


Z30093 


Homo sapiens


basic transcription factor 2, 
35 kD subunit 


1S76 


99 


986 




Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3 
homologous domain 3 (MH3) 


4697 


99 


987 


AF227258


Bos taurus 


RPGR- interacting protein-1 


1262 


38 


988 


AL022238 


Homo sapiens


uoiu^zKiiu.^ (supported by 
GENSCAN, FGENES and GENEWISE)


4048 


99 


"989 


AL022238




QiJiu^^Kiu.ii (supported by 
GENSCAN, FGENES and GENEWISE)


2321 


99 


990 


AF161425 




**>J t \_ J u o 


448 


92 


991 


AF161426 


Homo sapiens 


HSPC308 


448 


92 


992 


AP16142S 


Homo sapiens




453 


92 


993 


AL023859 


Schizosaccharomyces
pombe


tRNA-splicing endonuclease


172 


42 


994 


AL049631 




uuDxjjnj.i movei nomeobox 
domain protein) 


241 


47 


995 


AC005253


Homo sapiens


R26445_1


902 


100 


996 


AF265206 


Homo sapiens


MOG1 isoform A


974 


100 


997 


AJ248285 


Pyrococcus
abyssi


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


998 


AE003641 


Drosophila
melanogaster


ou . jjo uu 3*t ± . j gene product 


218 


53 1 


999 


W69343 


Homo 
sapiens 


Secreted protein of clone 
CR930 1. 


1340 


93 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence . 


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


"ioo 


1003 


AE004944 


Pseudomonas
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2053 


100 


1005 


S45367 


Canis
familiaris


centractin 


1949 


100


1006 


S45367


Canis 

familiaris 


centractin 


J. -J -LO 


98 


1007 


AB022158 


Mus 

musculus 


epsilon subunit 


2649 


— 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1671 


58 


1010 


Z68218 


Caenorhabdit 
is elegans 


K01H12.1 


259 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1S71 


58 


1012 


Z14000 


Homo sapiens 


RING1 


2017 


100 


1013 


G02841 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6922.


332 


93 


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244 


52 


1015 


Y02860 




Drotein prtcnHpH hv «-r*»r»*» 


664 


67 


1016 


Y02591


Homo sapiens


A human progesterone receptor
complex p23-like protein.


772 


97 


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino

acid RPffli^nro Cpfi t t\ Mr^ .in a 


2323 


100 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin


1710


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3. 


631 


100 


1020 


AF16479S 




bOA-teguidtea procein janus-a 


674 


100 


1021 


AF190655


Coturnix


qay j. - 1 


638 


96 


1022 


AL133363 


Arabidopsis 
thaliana


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483 


100 


1024 


AY007091 




similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2
(IP6K2) mRNA with Ge 


2243 


100 


1025 


X69910 




P63 protein


2958 


99 


1026


U8Q73£ 


Homo sapiens 


CAGF9 


1657 


100 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1


1048 


54 


1028 


AB032931


Homo sapiens


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 




Human secreted protein, SEQ
ID NO: 5878.


749


98 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


98 


1031 


AF193795


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 

protein 


685 


31 


1034 
1035 


Y41519 
AJ276004 


Homo sapiens 
Mus musculus 


protein encoded by gene 75 . 
Paxneb protein 


■L ozx 


99 


1036 


AF025459 


Caenorhabdit 
is elegans 


H14A12.3 gene product 1 


1709 
190 


77 
30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein 
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabdit 
is elegans 


weak similarity to 
Arabidopsis thaliana
ubiquitin-like protein 8


331


80


1040 


AF2S02O4 


Homo sapiens 


blood group carrier molecule
DOK1 


i ci n 
lb J / 


99 


1041 
1042 


Y96730 
AF140683 


Homo 

sapiens 

Mus musculus 


PRO539, a Costal-2 homologue.
F-box protein FWD2 


162 
2397 


22 
98 


1043 
1044 


AF151023 
AP181631 " 


Homo sapiens 

Drosophila 

melanogaster 


HSPC189
BcDNA.GH04929


1104 
204 


100 
37 


1045 


Y77985 


Homo sapiens 


Human collectin amino acid 
sequence . 


1940 


100 


1046 


AJ243372 


Homo sapiens 


O " DilO ^ Dflf^O" 1 1 1 f"*f~\ n /"V "1 d p h /'n m >\ a n 
« jg/ uwawituy xuv>uuoidCCOilaS6 


1317 


100 


1047 


A303S863 




synthetase beta subunit 
precursor 


2324 


99 


1048 
1049 


AL034550 
AF163825 


Homo sapiens 
Homo sapiens 


dJ1184F4.2 (novel protein
similar to nucleolar protein
4 (NOL4) (NOLP))


981 


92 


1050 


AF201949 


Homo sapiens 


° -tyii l piit»t.y v.e prOCcm J 

60S ribosomal protein L30 
isolog 


634 
868 


100 
100 


1051 


AF190624


Mus musculus 


mdgl-1


236 


85 


1052 


AE003529 


Drosophila 
melanogaster


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 


646 


98 


1054 


AL152756 


Neisseria 
meningitidis 


Glu-tRNA(Gln) 

amidotransferase subunit A


682 


44 


1055 


AF131856 


Rattus
norvegicus


tRNA selenocysteine
associated protein 


1525 


99 


1056


U89649


Chlamydomonas
reinhardtii


Mr19,000 outer arm dynein
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 
1059 


AF230929 
AJ270952 


Homo 
sapiens 
Homo sapiens 


keratinocyte annexin-like
protein pemphaxin


1710 


99 


1060
1061


AF224263
X63417


Heterodontus
francisci
Homo sapiens 


putative membrane protein 
HOXD8 

IRLB 


1363 
742 


100 
83 

100


1062


AL079345


Streptomyces
coelicolor
A3 (2) 


hypothetical protein 


1037 
143 


27 


1063 
1064 


Y71112 
AF363614 


Homo sapiens 


Human hydrolase protein-10
(HYDRL-10) . 


2547 


100 


1065 


Y13356 


Homo sapiens 


o^cuyi oyntiiecase 
Amino acid sequence of 
n rote* in PPn^^i 


3493 
1363 


99 
100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar
to AE000771 (PID:g2984292)


*62 


98 


"1067 


Y18930 


Sulfolobus 
solfataricus 


hypothetical protein 


162 


29 


1068 


R65969 


Homo 

sapiens T986 


Glioblastoma-derived
polypeptide.


887 


100 


1069 


Y07964 


Homo sapiens 


Human secreted protein 
fragment 


863 


96 


1070 
1071 


AF177476
AF24550S 


Rattus 
norvegicus 
Homo sapiens 


CDK5 activator-binding 

protein 

adlican 


1995 


85 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


3109 
147


99 
36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


69B 


98 


1074 
1075


U15779
Y13392


Homo sapiens
Homo sapiens


P 70 

Amino acid sequence of
protein PRO328.


380
1271


28


1076
1077 

1078 


AF161457 
Y79509 

AF223466 


Homo sapiens 
Homo sapiens 

Homo sapiens 


HSPC339 

Human carbohydrate-associated 
protein CRBAP-5. 
HT015 protein 


571 


100 
98 


1079 
1080 


AL132965 
AB024937 


Arabidopsis
thaliana
Homo sapiens


putative wd-40 repeat-protein
LUNX 


831 
286 


■gg 

29 


1081 


Y14768 


Homo sapiens 


v-ATPase G-subunit like 
protein 


1284 
579 


100
100 


1082


AF016416 


Caenorhabdit 
is elegans 


F29A7.4 gene product 


141 


31 


1083 


L13291 


Homo sapiens 




802 


45 


1084 


AB041541 


Mus musculus 


unnamed protein product 


151 


44 


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6003.


202 


97 


1086 


AB030S14 


Homo sapiens 


H-REV107 protein homolog 


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


2783 


100 


1089 

1090 
1091 


Y948£7 

AK023982
AB041586


Homo
sapiens
Homo sapiens
Mus musculus


Human protein clone HP10563.

unnamed protein product
unnamed protein product 


613 
130 


100 
49 


1092 
1093 


Y71277 
U34973 


Homo sapiens 
Mus musculus 


Human Zlipo3 protein, 
protein tyrosine phosphatase - 
like 


1103 

606 

1131 


81 

100 

95 


1094 


Y66677 


Homo 
sapiens 


Membrane -bound protein 
PR0828. 


522


56


1095 


Y87276 




Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


1029 


99 


1096 
1097 


Y87276 
AF161455 


Homo sapiens 
Homo sapiens


containing protein HSPP-53 

SEQ ID NO: 53. 

HSPC337 


863 


98 


1098 
1099 


U80029 
AJ005866 


Caenorhabdit 
is elegans 
Homo sapiens 


similar to thioredoxin 
Sqv-7-like protein 


742 
242 

1321 


98 

39
99 


1100 
1101 
1102 


AJ005865
AJ00586S 
AJO0S865 


Homo sapiens 
Homo sapiens 
Homo sapiens 


Sqv-7-like protein
Sqv-7-like protein 
Sqv-7-like protein 


1118 

891 

1016 


99 
99 
99 


1103 
1104 


AL110244 
AF242194 


Homo sapiens 

Drosophila 

melanogaster


hypothetical protein 
brakeless-B 


299 
147 


31 
52 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel
protein similar to C. elegans
C02C2.5)


968 


100 


1106 
1107 


U28016 
AJ278150 


Mus musculus 
Homo sapiens 


(phosphodiesterase) -related 
protein 

putative lipid kinase 


1624 


87 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7814. 


2207 
495


99 
98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 
1112 


Y28921 
AF176704 


Homo
sapiens 
Homo sapiens 


Human regulatory protein 
HRGP-7. 

F-box protein FBX9 


1331 


51 


1113 
1114 


AF182076 
G04039 


Homo
sapiens

Homo sapiens


glioma tumor suppressor 
candidate region protein 2 
Human secreted protein, SEQ
ID NO: 8120.


2027
2418

475


99
100


1115


AF229439


Mus musculus


zinc finger protein 289


1116 
1117 
1118 


L40357 
L40357 
A12155 


Homo sapiens 
Homo sapiens 
Homo sapiens 


thyroid receptor interactor 
thyroid receptor interactor 
Human X5L cDNA. 


1697 

509 

404 


91 

inn ™ ' 
85 


1119 


AL1S1542 


Arabidopsis
thaliana


isomerase like protein 


1673 
607 


100 
53 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat
Ca2+/Calmodulin dependent
Protein Kinase LIKE protein)




98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25.


321 


36


1122 


Z14122 


Xenopus 
laevis 


XLCL2 


455 


77 


1123 


AP225418 


Homo sapiens 


lipase 


1531 


97 


1124 


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


TOOT 


100 


1125 


AL035690 


Homo sapiens


dJ202I21.1 (novel protein) 


952 


100


1126 


AJ000217 


Homo sapiens 


CLIC2 


1286 


99 


1127 


AB030505 


Mus musculus 


UBE-lc2 


1069 


79 


1128 


Y73i^S 


Homo sapiens 


HTRM clone 1427838 protein
sequence . 


874 


100 


1129


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase
amino acid sequence.


877


100 


1130 


AL0235^3 


Homo sapiens 


dJ347H13.4 (novel protein) 


557 


100 


1131


Y91945 




Human chaperone protein 6

\ fUJlr * D } . 


1408 


100 


1132 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein


596 


39 


1133 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


factor 


3597 


100 


1135 


AF079765 


Mus musculus 


enhancer of polycomb 




41 


1136 


M^2419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila
melanogaster


^laLiiiiu aiaocidcea procsin 


1254 


78 


1138 


Y7621B 


Homo sapiens 


encoded by gene 95. 


440 


98 


1139 


W88104 


Homo 
sapiens 


HRABS-2 . 


1065 


99 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of
protein PRO339.


3979 


98 


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product. 


J JUS 


100 


1142 


Y13402


Homo sapiens 


Amino acid sequence of 
protein PRO310. 


1 CCkA 


99 


1143 


G03875


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


oou 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 




100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1147 


AL022157


Homo sapiens


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629.


370 


99 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence . 


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein
encoded by gene 113 clone
HEAAR60.


228


55






1151 




Rattus
norvegicus


neural membrane protein 35;
NMP35


1570 


92 


1152 




Homo sapiens


lysophosphatidic acid
acyltransferase-gamma1


1855 


99 


1153 




Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


X ** 1 / U3 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 3117. 


607 


99 


1157 


API 124 4 4 


Lupinus 
luteus 


b-asparaginase 


287 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1163


AF113534 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501


Homo sapiens


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


75 


1167 


AF187733 


Homo sapiens 


syntaphilin 


831 


42 


1168 


ABO 194-Jb 


Homo sapiens 


phospholipase


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188 


Saccharomyces
cerevisiae


putative


180 


22 


1172


AFllo 751


Mus musculus


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173

AX ' <9 




Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


U41278 


Caenorhabdit 
is elegans 


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J-alpha region


284 


83 


1177


afTi neon 


Arabidopsis
thaliana


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039715 


Caenorhabditis
elegans


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo 

sapiens] \ 
>R94974 
R94974 09- 
MAY-1996 27- 
OCT- 1994 
Human TCL-1 
polypeptide. 


T cell leukemia/ lymphoma 1 


617 


100






[Homo
sapiens 








1183 


U42841 


Caenorhabdit 
is elegans 


short region of weak
similarity to collagen 


161 


33 


1185 


AJ131613 




dicarboxylate carrier protein


1470 


99 


1186 


L27645 




growth-associated protein


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 

ITT UPDM 


636 


100 


1188 


AF217544 


Xenopus 


ornithine decarboxylase-2


1459


60


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth)


182 


33 


1190 


AO jOUi 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 




Rattus
norvegicus


PV-1 


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone
vc16_1 derived protein.


918 


100 


1194 




Rattus
norvegicus


stathmin-like-protein splice
variant RB3 ' * 


1093 


97 


1195 


U35244


Rattus
norvegicus


vacuolar protein sorting
homolog r-vps33a


2981 


96 


1196 


I / U<i /U 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein


912 


47 


1198 


DDI *> A ^ 

At i^b443 


Caenorhabdit 
is elegans 



contains similarity to S. 
pombe phosphatidyl synthase 
(GB: Z28295) 


460 


39 


1199 


/it *v J *a 


Homo sapiens 


DC12 


1649 


88 


1200 


auUj J. / /a 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast
suppressor protein SRP40) 


1143 


75 


1203 




Rattus 
norvegicus 


retinol dehydrogenase type I


890 


52 


1204 




Mus musculus


jerky 


2235 


76


1205 


AB002327 


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis
thaliana


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes
neurite outgrowth)


742 


100 


1208 


AF207989 




orphan G-protein coupled 
receptor 


2326 


100 


1209 
1210 


Z97630 
021549 


Homo sapiens
Mus musculus


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))

Ac39/physophilin


181 


44 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1280 
1267


68 
100 


1212
1213


AF117814


Mus musculus


odd-skipped related 1 protein


945 


66 




AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus


meiosis-specific nuclear
structural protein 1


1950


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216


Z72510


Caenorhabdit


similarity to yeast UTR3 


634 


49






is elegans 


protein (Swiss Prot accession 
yk677hll.S comes from this 
gene 






1217


249703


Saccharomyces
cerevisiae


unknown 


134 


22 


1218 
1219 


AC013430 
L10910


Arabidopsis 
thaliana 
Homo sapiens 


F3F9.18 

splicing factor 


199 


29 


1220 


Z70750 


is elegans 


similar to vanadate 
resistance protein 

t" A yi cilomKra « *■* _n -g - - - - - 

* anbucuiDr ctllQuS COCH&S £ ]f OCH 
r H 1 c rTOro 


1026 
965 


71 
58 


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222 


AF155100 




^ ahi, tmysi protein ^ y- kkn-21 
antigen 


2261 


100 


1223 


J05071 


Bos taurus 


protein gamma-6 subunit


356


100 


1224 


Y73364 




sequence . 


1169 


99 


1225 


AL050170 




hypothetical protein


714 


100 


1226
1227 


X640G2 
X04085 


Homo sapiens 
Homo sapiens 


RAP74 


2661 


99 


1228 


ACT005620 


Mus musculus


skeletal muscle-specific gene


284G 
1416 


100 
90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein


1715 


93 


1230 
1231 


X97571 
L08239 


Mus musculus 
Homo sapiens 


HCMV-interacting protein
located at OATL1


479 
2274 


96 
100 


1232 
1233 


AF121863 
AF121863 


Homo sapiens 
Homo sapiens 


sorting nexin 14 
sorting nexin 14 


1964 
1203 


100 
84 


1234 


&r*f\iA one 


Caenorhabdit
is elegans 


contains similarity to 
TR:O04595 


744 


31 


1235 


ACOo£S"34


Caenorhabdit
is elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162)


357 


33 


1236 
1237 


Y18101 
AB042646 


Homo sapiens 


macrophage actin-associated- 
tyrosine-phosphorylated 

TGIF2 


1559 
1224 


87 
100 


1238
1239 


AB026264 
AB026264 


Homo sapiens 


IMPACT 


1694 
1123 


100 
100 


1240 


G0O429 


Homo sapiens 


Human secreted protein, SEQ 


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis
thaliana




499 


28 


1243 


X76483 


Gallus
gallus


Yes-associated protein
(65kDa)


574 


48 


1244 
1245 


AF220186 
AL021453 


Homo sapiens


uncharacterized hypothalamus 
protein HT012 

uvjd^iuii. J tfuiAllvfc. protein) 


503 
856 


100 
100 


1246 
1247 


AJ276003 
Y57910 


Homo sapiens 


GAR1 protein

Human transmembrane protein 
HTMPN-34. 


1216 
1369 


100 
98 


1248 


AC004874 


Homo sapiens 


similar to N-acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1


1139 


100


1250


Y13148


Rattus
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19


124


46


1252 


HT X^O /Jo 


Rattus 
norvegicus 


testis specific protein


771


83


1253


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 68C6. 


419


07 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide. 


1045 


99 


1255 


AC006538 


Homo sapiens 


BC41195 1 


831 


78 


1256 
1257 


AB004316 
Z35094 


Bos taurus
Homo sapiens


mitochondrial methionyl-tRNA
transformylase

SURF-2




88 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of
protein PRO214.


1354
2383 


97 
100 


1259 


AC006014 


Homo sapiens 


similar* to pro hranQf^wnin» 
*** kM - t i«w xvf fr Liaiis c orming 

protein; similar to P14373 

(PID:gl32517) 


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469


100 


1261 


V00507 


Homo sapiens 


COdinCf Seoupnrp r»*F DUPD It ia 

1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443


Rattus sp.


gamma-glutamyl transpeptidase
(AA 1-568) 


697 


32 


1263 


AF173871 
AF178983


Mus musculus 
Homo sapiens 


neuronal PAS 3 


977


94


1265 


Y70473 


Homo sapiens 


Ras-associated protein Rap1
Human cyclic nucleotide- 

3SSOciat"Pr? nrnhp'tn.l t fw*. n 

1) . 


433
2785 


97 
99 


1266 
1267 


Y41738 
AF061346 


Homo 

sapiens 

Mus musculus 


Human PRO541 protein
sequence.
Edp1 protein


1622


100


1268 
1269 


U97006 
AF233582 


Caenorhabdit 
is elegans 
Mus musculus 


C13F10.4 gene product 
GTPase Rab37 


1077 
154 


64 

23 


1270 


AF195951 


Homo sapiens 


68 


942 
3127 


95
98 


1271 
1272 


AL031177 
AF201933


Homo sapiens 
Homo sapiens 


dJ889M15.3 (novel protein)

DC11


1150 
650 


55 
100 


1273 
1274 

1275 


AF201933 
AL021710 

AC004449 


Homo sapiens 
Arabidopsis 
thaliana
Homo sapiens


DC11

putative protein

R33683_3


346 
348 


98 
49 


1276 


Y8429S


Homo sapiens 


Human secreted protein
HL2AG87, SEQ ID NO: 210.


556 
1920 


100 
100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon


478 


100 


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1280 


AF161380 


Homo sapiens


HSPC262


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour-
associated protein 71.


779 


100 


1282 
1283 


AC015446 
AK024432 


Arabidopsis
thaliana
Homo sapiens


Similar to AIG1 protein
FLJ00022 protein


406


35 


1284 
1285 


W96"1S3 
AJ001019


Homo sapiens 
Homo sapiens 


Human FADD-interacting
protein (FIP).
ring finger protein


4 03 1 
1825 


35
81 


1286 


AE003823


Drosophila
melanogaster


CG13178 gene product


1301 
195 


100 
29 


1287 


AF178632 


Homo sapiens 


FEM-1-like death receptor
binding protein


32~6l 


100 


1288 


AC006033


Homo
sapiens


similar to MLN 64; similar to
I38027 (PID:g2135214)


1195 


100 


1289 
1290 


AC006033 
AB023811


Homo
sapiens
Homo sapiens


similar to MLN 64; similar to
I38027 (PID:g2135214)

fU3A — 1~ 


668 
351 


93 
54


1291 
1292 


Z73424 


Caenorhabdit 
is elegans 


C44B9.1


235 


36 




Y94371


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF130425 


Homo sapiens 


retinoblastoma-associated
protein RAP140


489


29 


1294 
1295 


G03856 
AF133670 


Homo sapiens
Mus musculus


Human secreted protein, SEQ 
ID NO: 7937. 

ARL-6 interacting protein-2 


538 
367 


99 
51 


1296 
1297 


AJ249735 
X57560 


Homo sapiens 

Escherichia 

coli 


claudin-6 
pspE protein 


1142 
535 


100 
100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains
protein 1 


1997 


100 


1299 


U41023 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
ykl09h8.5 


324 


29 


1300 


AB024523 


Homo sapiens 


basic kruppel like factor


1206 


100 


1301 
1302 


X55999 
AF007151 


Homo sapiens 
Homo sapiens 


protein 
unknown 


737 


99 


1303 


X52904 


Escherichia 
coli 


open reading frame (AA 1-^5) 


1481 
359 


100 
100 


1304 
1305 


U19577 
AF266508 


Escherichia 
coli 

Mus musculus 


NELF protein 


242 


93 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein
HTMPN-25. 


1409 
932 


97 
100 


1307 


U58750 


Caenorhabdit 
is elegans


similar to the mitochondrial
carrier family 


365 


54 


1308 
1309 


AF044774 
AL078593 


Homo sapiens 
Homo sapiens 


breakpoint cluster region 
protein 2 

dJ21CBl.l (KIAA0680) 


2681 
267 


99 
34 


1310 
1311 


X82693 
Z82263 


Homo sapiens 
Caenorhabdit 
is elegans 


C47A4.1 


620 
283 


96 
35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 
1314 


Y41763 
AF196972 


Homo 
sapiens 
Homo sapiens 


Human PRO938 protein
sequence.
JM24 protein 


1636 


100 


1315 


AF053356


Homo sapiens 


insulin receptor substrate 
like protein 


2239 
228 


100 
97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1317 


AF133127 


Gallus
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 
1320 


AF153127
X56932


Gallus 
gallus 

Homo sapiens 


SAPK interacting protein 
23 kD highly basic protein 


1651 


86 


1321
1322


AF174505


sapiens] 
>Y83086 
Y83086 09- 
MAR-2000 28- 
AUG-1998 P- 
box protein 
FBP-18. 
[Homo 
sapiens 


F-box protein Fbx25


1044 
467 


100 
70 


1323


M61732


Trypanosoma
cruzi 


neuraminidase 


214 


24 




Y17013 


porcine 
endogenous 


pol


504 


54






retrovirus 








1324 


AL133655 


Arabidopsis
thaliana 


putative protein 


1174 


37 


1325 


AL138655


Arabidopsis 
thaliana 


putative protein 


946 


35 


1326 


Atl3321S 


Homo sapiens 


DA108L7.2 (novel protein
similar to rat tricarboxylate 
carrier) 


1322 


99


1327


AF161541


Homo sapiens


HSPC056


1357


99


1328


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence . 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 

1332


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


232 


39 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence . 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1


411 


91 


1334 


Z82271 


Caenorhabdit 
is elegans 


Similarity to Mouse kinesin-
like protein KIF4 comes from
this gene 


578 


44 


1335 


AE000810 


Methanobacte 
rium 

thermoautotr 
ophicum 


conserved protein 


290 


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus


protein phosphatase 


378 


84 


1338 


U64856 


Caenorhabdit 
is elegans 


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-ll protein 


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398-
67H81 


289 


45 


1342 


AJ276171


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 

(PID:g4650844) 


894 


35 


1345 


AF2574g6" 


Homo sapiens 


N-acetylneuraminic acid
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like
protein 


1664 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96 . 


1117 


100 


1351


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14


613 


100 


1354 


AC005328


Homo sapiens


R26660_1, partial CDS


870 


74 


1355 


AC024876 


Caenorhabdit 
is elegans 


contains similarity to 
SW:RPB1 CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III


1876 


64 


1359 


AF217188 


Mus mus cuius 


YIP1B 


801 


63 


1360 


ACQ 743 31 


Homo sapiens 


ZNF234


3869 


100 


1361


AL163279


Homo sapiens 


homolog to cAMP response 


5035 


99








element binding and beta 
transducin family proteins 






1362 


248475 


Homo sapiens


glucokinase regulator


3160 


99 


1363 


248475 


Homo sapiens 


glucokinase regulator


2682 


97 


1364 


API QRTCA 


Homo sapiens 


megakaryocyte-enhanced gene
transcript 1 protein; MEGT1
protein 


205S 


99 


1365 


API 1 A £/l 0 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915


581


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans 
T19B10.6 (Tr:Q22557)) 


2581 


99 


1368


Y34124


Homo
sapiens


Human potassium channel
K+Hnov15.


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 




At U(Jo22U 


Bacillus 
subtilis 


xtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -
25 to 1018) (3416 is 2nd base
in codon)


5908 


99 


1372 


1 QQ H/IQ 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain
protein) 


1296 


99 


1373 




Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375


U53445


Homo sapiens


DOC1


1645


46


1376


AL117337


Homo 
sapiens 


OA3 93J16.1 (zinc finger 
protein 33a (KOX 31)) 


250 


60 


1377




Homo sapiens


R26660_1, partial CDS


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene


1823 


69 


1379


Jjlb J 13 


Caenorhabdit
is elegans 


putative 


858 


58 


1380






Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN


959 


97 


1383


AF237676


Mus musculus


G beta-like protein GBL


1721 


96 


1384


AF237676


Mus musculus


G beta-like protein GBL


1043 


70 


1385


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1.


715 


100 


1386 


AF212162 


Homo sapiens 


ninein


10369 


99 


1387


at mi CDC 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890 


Homo sapiens 


similar to zinc finger
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996
27-APR-1995 TRP-1 protein.


542 


86 


1389 


AF187989




zinc finger protein ZNF223


2665 


99 


1390


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


*»* *. U / O J*» 


Homo sapiens


PIST 


1410 


97 


1392 




Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393 


X90840


Homo sapiens


axonal transporter of
synaptic vesicles 


4584 


99 


1394 


JX K*n "7<C T /i Q 
u / D £ *i y 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


99 


1395 




Homo sapiens 


Human secreted protein, SEQ
ID NO: 6305.


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana


Similar to 


lift 

J. J V 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


65 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72.


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyces
cerevisiae


putative HMG box 


164 


27


1403 


Y79222 


sapiens 




2842 


100 


1404 


X81058 


Mus musculus


tex2Si 


1010 


99 


1405 


AB012084 


Mus musculus


ITM 


174 


29 


1406 


A3 03 02 51 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 


AJ010585


Rattus 
rattus 




2684 


99 


1408 


X75760 


Drosophila melanogaster


LRR47 


364 


29 


1409 


D76618 


Mus musculus 




804 


48 


1410 


AC005578


Homo sapiens 


f4UBO/_i, parciai CUb 


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein. 


360 


100 


1412 


X01563


Escherichia 
coli 


v-yj.a/ iaa i-i/»j 


911 


100 


1413 


W78279 




Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase 


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Vfl QQ/1 c 
I 3 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AL162458 


Homo sapiens 


bA465L10.5 (KIAA1176 (novel
protein, presumed ortholog
of mouse K-Cl cotransporter
KCC2))


5696 


100 


1421 


VQ OA O 

X 99426 


Homo sapiens 
, 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


152 


29 


1422 




Homo sapiens 


Human secreted protein clone 
qsl4_3 protein sequence SEQ 

TT> MP* . C*5 
±U DiU: DZ . 


4039 


99 


1423 


AF177388


Homo
sapiens


cancer-amplified
transcriptional coactivator 
ASC-2 


10748 


99 


1424 


Y48517 


Homo sapiens 


Human breast tumour-


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454 


89 


1426 


AF208848 


Homo sapiens 


BM-006 


853 


79 


1427 


AF112886 


Bos taurus


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534






2853 


78


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


sapiens 


Membrane-bound protein
PRO1106. 


1886 


100 


1432 
1433 


AF193613 
AB044560 


Homo sapiens 
Mus musculus 


cell recognition molecule

Caspr2 

Gliacolin 


568 


100 


1434 


R9930O 


Homo sapiens


* nerve protein, 
cdcmtates regeneration or 

iici vc UCl 1 9 ■ 


192 
707 


34 
51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3


1282 


100 


1438 


Y30311 


Homo sapiens 


Human secreted protein 
encoded from gene 1.


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform


3083 


100 


1441 


AF219138


Homo sapiens


GGA3 long isoform


3346 


100


1442 


AB039669


Homo sapiens




1944 


100 


1443 


AF237711 


Drosophila
melanogaster 




191 


27 


1444 


AJ011896 


Homo sapiens 




439 


39 


1445 


X73874 




jjuusuiiuiyidae Kinase 


6233 


98 


1446 


AF214114 


Homo sapiens


breast carcinoma-associated
antigen BCAA


3999 


99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448 


AF003136 


Caenorhabdit 

is elegans


contains weak similarity to
an ATP-binding motif


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen


1184


39


1450 


Y95004 


Homo sapiens 



Human secreted protein 
vc54_1, SEQ ID NO: 48.


585


100 


1451 


AF107201


Homo sapiens


ataxia 2-binding protein 


688 


57 


1452 


AF107203 


Homo sapiens 


ataxia 2-binding protein 


456 


78 


1453 


23 9011 


Mus musculus 


DMR-N9 


882


56 


1454




Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE


510 


28 


1455 




Homo sapiens 


dJ564M11.3 (similar to 
sialyltransferase)


1356 


100 


1456


D44480 


Mus musculus 


MATH-2 protein


272 


100 


1458 


Ar l^ij^Q 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45


1459 


AF242552 


Gallus
gallus 


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


64 


1461 


AB02^8 


Mus musculus


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase 


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19599
(NID:g774333)


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19599
(NID:g774333)


869 


98 


1465


U32743 


Haemophilus 

influenzae 

Rd 


fucose operon protein (rucO) 


315 


50 




IU3022 


Homo sapiens 


Not56-like protein 


2342 


100 


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney-
specific (KS) gene 


1072 


99 


1468 


AF071544 


Spinacia 
oleracea 
1 


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


2<j 


1469 


Y57930 


Homo sapiens


Human transmembrane protein
HTMPN-54.


1053 


100 


1470 


AF032666 


Rattus
norvegicus


rsec5 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17).


452


74


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473


APT 777 QO 


Homo sapiens 


genethonin 3 


4026 


98 


1474 


S45936 


Homo sapiens 


HTS1


1101


50 


1475 


YB^241 " 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 155.


1379 


98 


1476 


AJ010317 


Fugu 

rubripes 


Sand 


1278 


58 


1477 


U42831 


Caenorhabdit
is elegans


coded for by C. elegans cDNA
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116


100 


1480 


U10536 


Pan paniscus


MHC class I A


675 


84


1481


AL078599


Homo sapiens 


dJ9SlC6.l (novel protein 
similar to C. elegans
F55A12.9 (Tr:P91086)) 


1274 


65 


1482 

1483 
1484 


Z98977 

AB005662 
AL050120


Schizosaccharomyces
pombe
Mus musculus
Homo sapiens


putative vacuolar protein
JNK/SAPK-associated protein-1


256
4968


29 
92 


1485 
1486 


M27878
Y69161


Homo sapiens 


DNA binding protein 
partial protein kinase. 


716 

1006 

575 


100 

53 

99 


1487 
1488


X84156
AF038963


s cerevisiae
Homo sapiens


ATK1

RNA helicase


341 


29 


1489 


U56966 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3


446 
620 


34 
42 


1490 


AE000989 


Archaeoglobu 
s fulgidus 


enoyl-CoA hydratase (fad-4)


533 


46 


1491 


M80633 


norvegicus


adenylyl cyclase type IV


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 

sequence . 


3<Jl3 


99 


1493 
1494 


Y17220 
AF133670 


Homo sapiens 
Mus musculus 


Human secreted protein (clone 
f j 283-11 ) 

ARL-6 interacting protein-2 


452 


37 


1495 
1496 


Y94897 
AL049699


Homo 
sapiens 
Homo sapiens 


Human protein clone HP10S74. 
au tnz j .z \novei protein) 


701 
1371 

1550


97
100 

100 


1497 
1498 

1499


AF037447 
AL445067 

AB039947 


Homo sapiens 
Thermoplasma 
acidophilum 

Homo sapiens 


ribosomal S6 protein kinase
putative, target YPL207w of
the HAP2 transcriptional
complex related protein
X11L-binding protein 51


2427 
269 

227 


100 
35

36


1500 
1501 

1502 


AJ277750 
AL0S0333 

AF179896 


Homo sapiens 
Homo 
sapiens 
Homo sapiens 


UBASH3A protein 
dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 
TALE homeobox protein Meis2b


3509 
2439 

1140 


100 
100 

100 


1503 
1504 

1505 
1506 


AF178949 
Y5300S 

X82494 
X98296 


Homo sapiens 
Homo sapiens 

Homo sapiens 


TALE homeobox protein Meis2a 
Human secreted protein clone
pn749_8 protein sequence SEQ 
ID NO: 16. 
fibulin-2 


1177 
1442 

3580 


100 
99 

99


1507

1508


AL034548 
Y76144 


Homo sapiens 
Homo sapiens 


dJ1103G7.6 (novel protein) 
Human secreted protein 
encoded by gene 21.


783 

1098 

1736 


42 

100 

100 


1509 


AF220182 


Homo sapiens 


uncharacterized hypothalamus
protein HT008


1181 


98 


1510
1511


U64$01 


Caenorhabdit 
is elegans 


next cosmid


415 


58


1512


AL356192 


Neurospora 
crassa 


related to mdmi nmf pi n ~ 


196 


29 


1513


D17629
AF168717


Homo 
sapiens 
Homo sapiens 


sulfate sulfatase (GALNS)
x 009 protein 


1829 
694 


100 

99


1514 
1515 


AJ243531 
AC003672


Homo sapiens
Arabidopsis
thaliana


nM15 protein

putative C3HC4-type RING zinc
finger protein 


735 
407 


100 
30 


1516


AF115435


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 

1518


AF003140


Caenorhabdit
is elegans


C44E4.5 gene product 


274 


31 


T5l9 " 


AB002S84 
WL121764 i 


Rattus 1 
aorvegicus 
Schizosaccha ~ 


beta-alanine-pyruvate
aminotransferase

yeast atp12 protein precursor


2238 

270


92 
30
TABLE 2



SEQ 
ID 
NO: 


ACCESS ION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 






romyces 
pombe 


homolog






1520 
1521 


AF255910 
D31764 


Homo 
sapiens 
Homo sapiens 


vascular endothelial 
junction-associated molecule 
KIAA0064


547 




1522 


Y66634 


Homo 

cam" on c* 
sapiens 


Membrane-bound protein
rKUlau . 


170 
985 


27 
100 


1523 


Y94450 


Homo sapiens


Human inflammation associated
protein 


250 


43 


1524 
1525 


AC000107
AF109377 


Arabidopsis 
thaliana 
Mus musculus 


F17F8.22 
IdlBp 


277 
1277 


3*7 

83 


1526 
1527 

1528 


AL031427 
Y08135 

AK024423 


Homo sapiens 
Mus musculus

Homo sapiens 


dJ167A19.4 (novel protein) 
acid sphingomyelinase-like
phosphodiesterase
FLJ00012 protein 


1432 
1496 

£ll 


99 

f 2t 


1529 
1530


AF154502
AF205598 


Homo sapiens 
Homo sapiens 


quiescent cell proline 
dipeptidase 

transposase-like protein


679 
1368 


100 
100 

100


1531 
1532 


AF251039
W74805 


Homo sapiens 
Homo sapiens 


putative zinc finger protein
Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


1420 
493 


50 
57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein;
RanBP6 


5707 


99 


1534 
1535 


AC007190 
AB027564 


Arabidopsis 
thaliana
Homo sapiens 


F23N19.9 
DINB1 


374 
4482 


37 
100 


1536 
1537 


Y36178 
Y50907 


Homo sapiens 
Homo sapiens 


Human secreted protein 
Human fetal brain cDNA clone
vb3_1 derived protein.


377 
3593


87 
99 


1538 
1539 

ic, n 


AF017368 
AF266756 


Mus musculus 
Homo sapiens 


faciogenital dysplasia 
protein 2 

sphingosine kinase


177 
2011 


47 
99 


1540
1541 


Z48804
AF000195 


Homo sapiens 
Caenorhabditis elegans


OA1 

Contains similarity to Pfam 
domain: PF00169 (PH) , 
Score=20.6, E-value=1.9e-05,
N=1


2238 
379 


100 
42 


1542 

1543 
1544 


Y71159 

X76092 
AB015330 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 

DNA binding protein RFX3 
HRIHFB2007 


9415 
3327 


99 
100 


1545 
1546 

1547 


AF198487 
AF016417 

X55885 


Homo sapiens 
Caenorhabdit 
is elegans 
Homo sapiens 


transcription factor LBP-1b
Similar to BZIP transcription 
factor 

KDEL receptor 


631 

2822 

518 

lib* 


50 
100 

42


1548
1549 


AB035495 
AL021707 


Carassius
auratus
Homo sapiens 


ubiquitin-activating enzyme
E1

dJ508I15.4 (KIAA0668) 


836 


100 
42 


1550 


AJ223978 


Bacillus
subtilis


Yvqx protein 


3688 
292 


100 

T5 


1551


AF145615 


Drosophila
melanogaster


BCDNA.GH03377 


822 


44 


1552 
1553 


AL157734 
AF079527


Schizosaccharomyces pombe

Mus musculus 


putative mannosyl transferase 
involved in N-glycosylation 

IER5 


435 


37 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


691 
1099 


63 
88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule,
ISMO-3. 


1780


99 


1556
1557


AF116553 
Y71056 


Drosophila
melanogaster 
Homo sapiens 


antennal-specific short-chain
dehydrogenase/reductase
Human membrane transport


277 
1975 


32 
99
TABLE 2



SEQ 
ID 
NO: 


ACCESSION
NUMBER




DESCRIPTION


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 








protein, MTRP-1.






1558 


Y71056 


Homo sapiens


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-acetylglucosaminyltransferase


262 


44 


1561 




Homo sapiens


ou303K2Q . 2 (acrosomal protein 
ACRS5 (similar to rat sperm 
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 







Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100 


1564 




Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AV.UU3JU O 


Homo sapiens 


R27216 1 


919 


82 


1566 




Caenorhabditis elegans


Contains similarity to Pfam 
domain: PF00169 (PH) , 
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein 
beta-TRCP2 isoform C 


2879 


100 


1568


U'i y 1 / J 


Mus musculus


truncated form of Sox17


1047 


78 


1569


AX025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu 


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1


2388 


100 


1572 


AE003831 


Drosophila 
melanogaster 


CG18445 gene product 


180 


31 


1573 


AF074603 


Streptomyces 
griseus 
subsp.
griseus 


NonF 


205 


38 


1574




Caenorhabdit 
is elegans 


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster 


Diablo 


421 


54 


1578


f>nno7t; 

L>(juy /b 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579


Ar 24 o744 


Cryptosporidium parvum


thrombospondin-related
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein
(translation of cDNA 

r m . a tfn n n o i q \ \ 
dm . /ii\.u u u <; Xj } / 


663 


100 


1581 


AF041853




kinesin family member protein
KIF3A 


345 


33 


1582 


AF025441 




u F tt liiuctdtuAiig protein uifa 


1 T no 


100 


1583 


AE001B03 


Thermotoga 
maritima


glycerate kinase, putative 


349 


34 


1584 


AF252283


Homo sapiens 


Kelch-like 1 protein 


3973 


100 


1585 


AF169675 


Homo 


leucine-rich repeat
transmembrane protein FLRT1


3494 


99 


1586 


AF118274 


Homo sapiens 


DNb-5 


2628


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R 


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98 4. 


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccha 
romyces 


hypothetical protein 


235 


54
TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 






SMITH- 
WATERMAN 


%

IDENTITY 






pombe








1595


W78324 


Homo sapiens


protein encoded by gene 81. 


IJio 


98 


1596 


Y94906 


Homo sapiens 


rb649__3 protein sequence SEQ 
ID NO: 18. 


— TTxc 


98 


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25 


1408 




1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 




1600 


X82200 


Homo sapiens 


gpStaf 50 


2305 


100 


1601 


Y00876


sapiens 


sequence . 


1 1 AO 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacr 1 r\a nont-D-in t 


2821 


99 


1603 


AJ222801 


Homo sapiens 




2268


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase


1601 


99 


1605


AF185576 


Mus musculus


POZ/zinc finger transcription
factor ODA-8


3435 


97 


1606 


AF093744 






131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800 


98 


1608 


Y57949 




— . 

Human transmembrane protein 
HTMPN-73. 


1868 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1 - 728) 


376"S 


100 


1611 


Y08200 




rab geranylgeranyl
transferase


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004481 


thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501 


Homo sapiens


NADH-cytochrome b5 reductase


1607 


100 


1615 


Y15521 


Homo sapiens


start position 1 


3150 


97 


1616 




norvegicus


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1) 


890 


62 


1617 


X58079 




S100 alpha protein


481 


100 


1618 


Y«678 


Homo 
sapiens 


Membrane-bound protein
PRO1009. 


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 




288 


100 


1621 


AJ007509 




bad 3jiu;d- aesoci3csu protein 


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobus fulgidus


A. fulgidus predicted coding 

-i.fciyj.QH nf U03j 


240 


36 


1624 


AL355013 


romyces 
pombe 


mitochondrial carrier protein


403 


34 


1625


Y66746 


Homo
sapiens 


Membrane-bound protein
PRO1198.


1184 


100 


1626 


D90053 


Sus scrofa 


destrin


Dbj 


100 


1627 


Y35954 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO.
203.


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus


unknown 


286 


68 


1630 


AF017096 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03c 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


76*3 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein
gb|H36135, gb|Z26200 come
from this gene. 


143 


38 



TABLE 2



SEQ 
ID 
NO:


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH - 
WATERMAN 
SCORE 


%

IDENTITY 


1635 


AF026246 


Homo sapiens 


HERV-E integrase


411 


90 


1636 


Y50943 


Homo sapiens


Human adult brain cDNA clone
ve8_1 derived protein.


1126 


95 


1637 


AF134593 




L-pipecolic acid oxidase


2066 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit


1948 


96 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
yk251_1 protein sequence SEQ
ID NO: 90.


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila 
melanogaster 


WDS 


358


26


1642 


Mia Jbl 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2).


1352 


100 


1644 


Ar 1 JbbZQ 


Mus musculus 


WD repeat-containing F-box
protein FBW5 


2675 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42. 


1156 


100 


1646


X67155 


Homo sapiens 


mitotic kinase-like protein-1


4456 


99 


1647 


M63180


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648 


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW).


4137 


100 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


656 


99 


1651


AB015346 


Homo sapiens 


Eps15R


4464 


99 


1652 


AL161576 


Arabidopsis 
thaliana


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1656 


AB017910 


Dictyosteliu 
m discoideum 


mycM 


297 


32 


1657


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5. 


2251 


99 


1658


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL078627 


Schizosaccha 

romyces 

pombe 


actin-like protein; (2 actin 
domains) 


320 


34 


1662 




Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


s cerevisiae 


unknown 


138 


2* 


1666 


AF177385


Homo
sapiens


cytochrome c oxidase assembly 
protein isoform 2


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191 1 


1581 


47 


1668 


S67513 


Borna
disease 
virus BDV, 
WT-l, Halle 
Bl/91, horse 
brain, field 
isolate, 
Peptide, 370 


p40 


397 


43 



TABLE 2



SEQ
ID 
NO: 



1669 



1670 



ACCESSION 
NUMBER 



299753" 



G03130 



aa 

Schizosaccharomyces pombe



Homo sapiens 



DESCRIPTION 



putative NOL1-NOP2-sun family
nucleolar protein 



Human secreted protein, SEQ 
ID NO: 7211. 



SMITH- 
WATERMAN 
SCORE 



569 



427 



IDENTITY 



47 



97 



Gallus 
gallus 



cardiac muscle tensin



1185 



54 



1672 
1673 



AF174482 
Y51946



Homo sapiens



polycomb 3 



200S 



99 



Homo sapiens 
Homo sapiens 



Human 18 
fragment 
EXP35 



1 homolog protein



233 



29 



1676 



Y2S712 



Homo 
sapiens 



Human protein clone HP10563 . 



152 
109" 



Homo sapiens 



Human secreted protein 
encoded from gene 2. 



3043 



29 
30" 



99 



1?7T" 



1679 



1680 
1681 



Homo sapiens 



AF163151 



Homo sapiens 



Human secreted protein 
encoded from gene 2.



1580 



AF163151



Homo sapiens 



AK0244S3 
AF019236 



Homo sapiens 



dentin sialophosphoprotein 
precursor 

dentin sialophosphoprotein 
precursor 



FLJ00045 protein



170 



170 
1349" 



91 



17 



100 



1682 



1683 



1684 
1685 



Dictyostelium discoideum



AJ243459 



Leishmania 
major 



TipD 

proteophosphoglycan 



153 



Z69369 



X94910

AF285475" 



Schizosaccharomyces pombe



putative GTP-binding protein 



560 



Homo sapiens
Takifugu 



ERp28 



1334 



26 



46 



rubripes 



retinitis pigmentosa GTPase
regulator-like protein 



196 



19 



1687 
1688 



AF191298 
AJ275986" 



AJ275986 
X07311 



Homo sapiens 
Homo sapiens 



Homo sapiens
Drosophila 



vacuolar sorting protein 35 



transcription factor 



4087 



transcription factor
heat shock protein 



2958 



1886 



100 



100 
88 



1696 



1697 



1698 



AF240463 



melanogaster 



138 



Rattus 
norvegicus 



LIS1-interacting protein
NUDE1 



1383 



1699 
1700 



AK000193 



AB041035 
AB041035 



AF025772 
Y44676 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 
Homo sapiens 



methyltransferase 1-variant 2
unnamed protein product 



kidney superoxide-producing

NADPH oxidase 

kidney superoxide-producing
NADPH oxidase 



1060 
3122" 



C2H2 zinc finger protein
Human ARF-Related Protein-1
(HARP-1),



2181 



488 



43 



83 




100 



100 



100 



54



938 



1701 
1702" 



AX022407 
AB024574 



Homo sapiens 



unnamed protein product
GTP-binding like protein 2



315 



97 



AF05507B 



1704 
1705 



AT198092 
AE003573



Homo sapiens
Homo sapiens



Mus musculus

Drosophila 

melanogaster 



zinc finger protein 42
RP42 



1172 



421 



1057 



52 



77 



CG12474 gene product 



161 



1706 



AB036345 



1707 
1708 
1709



Y559-7 

U27121 

U91710" 



Drosophila 
melanogaster 



Homo sapiens 



Danio rerio 
Arabidopsis" 



aquaporin 

Human STLK2 protein. 



16T 



G12 

putative protein" 



2146 

212 

505



33 



100 



47

50" 



TABLE 2



SEQ 
ID 

NO: 

1710 


ACCESSION
NUMBER 

j B01311 


SPECIES 
thaliana


DESCRIPTION


SMITH - 
WATERMAN 
SCORE 


IDENTITY 


1711 
1712 


U40750
AJ011118

AF255303


Mus musculus

Mus musculus


auudij fi\u<^j. polypeptide- 
formin binding protein 30 
skeletal muscle and cardiac 


1649 
4561 
1490 


97

85
89


1713 




sapiens 


membrane-associated nucleic
acid binding protein 


4416 


99 


1714 


AF255303 


sapiens


membrane-associated nucleic
acid binding protein


2960 


100 


1715 


U08227 


Rattus
norvegicus 


«.d&- reiaceci protein 


511 




1716 


AF168795 


Rattus
norvegicus 




1129 


44


1717 


AF196304


Homo sapiens


SUMO-1-specific protease


5804 


99 


1718 


AL355737


Homo sapiens 


HMG20A 


1782 


100


1719 


AB029333


roretzi 


nri*is'r-i 


1069 


46 


1720 


AF071317 


Mus musculus


COP9 complex subunit 7b


1297 


97 


1721 




Homo sapiens 


heyjj protein 


1681 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063. 


718 


100 


1723 


AL032643 


Caenorhabditis elegans


similar to Uncharacterized
protein family UPF0034, 


825


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053.


586 


92 


1725 


Y94441 


sapiens 


Human Adipose Specific 
Protein 1. 


1231 




1726 


AF255443


Homo sapiens 


CGI-201 protein 


4397 


99 


1727 
1728 


AF1B3426 
D10884 


Homo sapiens 
Bos taurus 


HT004 protein 
neurocalcin 


1810 


99 


1729 


Z18529 


Gallus 
gallus


tensin 


1002 
1411 


84 


1730 
1732 


Z73423 


Caenorhabditis elegans


cDNA EST EMBL: 214908 comes 
from this gene-cDNA EST this 
gene 

PRO0105


233




1733


AJ277724 


Homo sapiens 


histone deacetylase 8


470 

2015


30 
100 


1734 
1735


G04050 
D45913 


Homo sapiens 
Mus musculus 


Human secreted protein, SEQ 
ID NO: 8131. 

leucine-rich-repeat protein


503 

3531


95 


1736 
1737


AF096709 

& en qci on 


Drosophila 
virilis 
Homo sapiens 


failed axon connections 
protein 

dynactin p62 subunit 


276 


94 

32 


1738 
1739


L15314 


Caenorhabditis elegans


contains similarity to Pfam 
family PF01772 N=l 


2417
206 


99 
37 






Listeria 

monocytogene 

s 


phosphatidylinositol specific
phospholipase C


134 


27 


1740


AL031656 


Homo sapiens 


dJ310O13.4 (novel protein
similar to predicted C. 
elegans and C. intestinalis
proteins) 


123 


31 


1741


Y35924 


Homo sapiens


Extended human secreted 
protein sequence, SEQ ID NO. 
173. 


1013


99 


1742


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 




1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08 . 


1932


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08 . 


1854 


61 


1745
1746


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPS1A


1224


70 


1747


Y99372 

1T94294 ""I 


Homo sapiens
Homo sapiens


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 
Human coenzyme A-utilising


1332 

342


99 
100



TABLE 2



ID 
NO: 



1748 
1749 



1750 



1751 
1752 



1753 



1755 



1756 



1757 



ACCESSION
NUMBER 



AX024436



AE000877 



AF101361 



Y15067 
AF251038



AC003093 



X69099 



AL0497S5 
AL0313S3 



SPECIES 



Homo sapiens 



Methanobacterium thermoautotrophicum



Drosophila 
melanogaster



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



AB040672



1758 
1759 



1760 
1761



AL022238 
AF117S53 



"V12065 " 
AL049712 



Homo sapiens 



Homo sapiens 
Homo sapiens 



Homo sapiens 
Homo sapiens 



Homo sapiens 
Homo sapiens 



DESCRIPTION 



enzyme CoAEN-2. 
FLJ00026 protein



conserved protein 



Abnormal X segregation 



ZNF232



GAP-like protein



OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059 
(PID:gl29308) 



165kD protein
dJ622L5.3 (novel protein)
dJ733D15.1
protein) 



(zinc-finger 



UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransfera



SMITH-
WATERMAN
SCORE



1619 



231 



193* 



889 



822 



352 



5703 



1039 



se 



dJ1042K10.4 (novel protein)



double homeobox protein 



hNop56



276T 
2020 



776 



375 



2959 



IDENTITY 



100 



36" 



33 



100 



100 



57 



99 



TOT 



100 



99 



43 
54 



dJ686C3.2 (nucleolar protein
hNop56) 



2595 



99 



1763 



1764 



1765



1766



1767 



AF169017
U91541 



Homo 
sapiens 



Homo sapiens 



Gene product with similarity 
to dynein beta subunit



1542 



Homo sapiens 



formiminotransferase
cyclodeaminase 



AB013365 
738421 



Bacillus 



halodurans



Homo sapiens 



human formiminotransferase
cyclodeaminase (FTCD) protein,
carboxy- terminal end 
YlqF 



877 
596" 



350 



AC009176 



Arabidopsis
thaliana 



Human secreted protein 
encoded by gene No. 36. 
putative ribulose-1,5-bisphosphate

carboxylase/oxygenase small 
subunit N-methyltransferase I



145" 



2lT 



51 



100 



100" 



34~ 



71 



27" 



AK000647 
AJ238982 



Homo sapiens 



unnamed protein product
VNN3 protein 



737 



1770 



1771 



1772 



1773 



1774 



1775 
1776 



1777 



1778 
1779 



1780 



1781 



U73522 



Homo sapiens 



U89435 



Homo sapiens



S70011 



AL035086 



Rattus sp.



Y99426 



AF110330
AJ269529 



Z81S79 



AY007239 
AL109608 



AF254260



L07924 



Homo sapiens
Homo sapiens 



Homo sapiens 



Homo sapiens 



Caenorhabditis elegans



Homo sapiens 



Schizosaccharomyces pombe



Homo sapiens 



Mus musculus



AMSH 
unknown 



tricarboxylate carrier
dJ44A20.2 (novel protein) 



Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 



glutaminase 



glycerol 3-phosphate permease



cdna EST y*75£l.S comes from" 
this gene 



monooxygenase X 



oxysterol-binding protein
family 



tuftelin 1



2665 



1214 



829 



1604 



2036 



1057 



3146 
278T 



23T 



1875" 



644 



guanine nucleotide 
dissociation stimulator 



1729 



247 



99 



86 



9T 



100 



100 



100 



31 
99" 



38 



100 
"S0~~ 



Homo 
sapiens 



ral guanine nucleotide 
dissociation stimulator 



142 



189 



49
TABLE 2



SEQ 
ID 
NO: 


ACCESSION 
NUMBER


SPECIES 


DESCRIPTION 


SMITH - 
WATERMAN 
SCORE 


% 

IDENTITY 








glucuronidase exon 11 homolog










TABLE 3



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS * 


2 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e- 
12 157-181 


3 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e-
13 358-331 


4 




Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 9.400e-
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 


5


BL00023


Type II fibronectin
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


6


BL00023


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353-
390 


8


BL00023


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


9 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 £.119e- 
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE 


PR00464D 17.40 6.182e-
12 294-312 PR00464G 
12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 5.500e-
10 89-99 PF00023B
14.20 2.636e-09 56-66 


14


DM00031


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPER FAMILY SIGNATURE 


PR00208A 12.59 d.8$8e- 
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434 


18 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 2.200e- 
25 55-80


20 


BL00487 


reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82
4.082e-12 287-329 


21 


BL00487


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e-
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


23
25


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333




BL00115


Eukaryotic RNA
polymerase II
heptapeptide repeat
proteins . 


BL00115T 3.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86 
8.000e-17 1604-1650 
BL00115M 19.19 8.130e- 
16 731-774 BL00115H 
14.34 9.392e-16 463- 
496 BL00115A 15.44
7.414e-15 43-82 
BL00115R 6.50 6.128e- 
14 983-1010 BL00115J 
16.71 5.289e-14 591-
617 BL00115I 8.33 
4.336e-13 535-590 
BL00115L 12.25 5.939e- 
13 662-694 BL00115G
11.65 6.011e-13 435- 
463 BL00115K 15.03 
3.417e-10 617-659 
BL00115O 16.76 5.805e- 
10 863-913 BL00115P
11.54 7.538e-10 913- 
953 BL00115S 18.24 
7.968e-10 1010-1052 
BL00115U 10.34 4.475e- 
09 1242-1265 


26 




Speract receptor repeat 
proteins domain 
proteins . 


BL00420A 20.42 4.109e-
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27 


BL00050 


Ribosomal protein L23 
proteins . 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B 
14.81 8.125e-12 133- 
147 


28 


PR00925


NONHISTONE CHROMOSOMAL

PROTEIN HMG17 FAMILY 


PR00925B 3.73 3.089e- 
10 41-54


29 


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557 


FMN-dependent alpha-hydroxy acid
dehydrogenases proteins. 


BL00557D 17.76 5.065e- 
37 274-316 BL00557A 
35.08 8.909e-29 24-73 
BL00557C 15.59 1.000e-
28 227-257 BL00557B 
21.27 8.898e-22 130- 
169 


34 


PR00629 


SHC PHOSPHOTYROSINE
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e- 
35 299-328 PR00629F 
10.95 8.364e-32 334- 
361 PR00629B 13.66 
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36


PD01270


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 








PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


37 


BL00412


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e-
10 264-298 


38 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412


Neuromodulin (GAP-43) 
proteins. 


BL00412C 10.28 9.241e- 
10 264-298


40 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e-
14 342-360 PR00380C 
13.18 6.927e-13 375-
394 PR00380D 9.93 
2.180e-12 429-451 
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 239-290 BL00345A 
13.96 2.452e-14 204-
223 


45 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A 
13.96 2.452e-14 180- 
199 


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e- 
26 172-202 DM01551C 
14.62 3.571e-17 232-
252 DM01551B 8.84 
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-BINDING NU.


PD01066 19.43 4.231e- 
33 6-45


50 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B
9.45 8.269e-10 302-312 


51 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-15 196- 
210 PR00988C 13.64 
6.108e-14 104-120 
PR00988E 8.27 3.872e- 
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.50 2.915e-
09 57-69 


55 


PR00762


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509- 
530 PR00762A 14.22 
9.333e-18 199-217
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 








PR00762F 15.12 3.100e-
16 563-583 PR00762B 
12.12 6.063e-16 230-
250 PR00762E 12.07 
2.286e-15 545-562 
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 8.800e- 
10 153-203 


58


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1080-1135 


59 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors . 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.75 9.018e- 
09 206-221 


68 


PR00360


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651


BTB (also known as BR-C/Ttk) domain proteins.


PF00651 15.00 8.714e- 
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e- 
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81


80 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334- 
351 


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393- 
410 


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 8.200e- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 




97


PR00081


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D 
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e- 
16 529-547 PR00380C 
13.18 2.756e-13 560- 
579


102


PR00300


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.56 7.545e-
14 289-308 


104 


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 6.786e-
15 298-314 BL00479A
19.86 4.913e-16 155-
178 BL00479A 19.86
4.300e-13 272-295
BL00479B 12.57 6.294e-
12 181-197


106 


BL01019 


ADP-ribosylation factors 


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 kv ZK632.12 YDR313C 
END0SOMA7. TTT 


DM01970B 8.6*0 S.OOOe- "' 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins. 


BL00191K 17.38 4.951e-
27 238-282 BL00191J 
11.37 6.447e-17 182- 
204 


109


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins 


BL01138A 10.96 8.297e-
10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.800e- 
23 156-187 BL00107B
13.31 9.100e-14 225- 
241 


117 


BL00214


Cytosolic fatty-acid
binding proteins.


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 B.S^Oe- 
13 36-67 


119 


PR00529 


GONADOTROPHIN RELEASING 
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120


PR00320 


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320 


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins . 


BL00215A 15.82 7.158e-
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e- 
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT 8
family proteins.


BL01310 14.74 6.694e- 
26 28-64 


130


PR00990


RIBOKINASE SIGNATURE


PR00990B 12.32 9.534e-
13 47-67 PR00990A
16.23 5.500e-14 20-42
rKUUjaui. 1Z.O.I 2.4l2e- 
09 119-133 1 


133 


BL00880


Acyl-CoA-binding
protein.


BL00880 17.52 5.576e-
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e-
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT 8 
family proteins. 


BL01310 14.74 2.432e- 
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
14 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07 
2.800e-13 18-35
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 158-175


141 


BL00501 


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479


151 


BL00632 


Ribosomal protein S4 
proteins . 


BL00632 23.79 5.271e-
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin
oxidoreductases
proteins.


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 
8.385e-13 99-151 
BL00559L 13.60 5.814e- 
12 241-259


155 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e- 
18 275-330 BL00406A 
9.95 5.776e-16 15-50 
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132 


Zinc carboxypeptidases,
zinc-binding region 1
proteins. 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


l£S 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 9.043e-
13 139-158 


ieia 


BL00362 


Ribosomal protein S15
proteins.


BL00362 24.67 9.700e-
15 129-172 


169 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A
18.44 1.964e-13 212-
251 BL00039B 19.19
4.553e-13 378-404
BL00039C 15.63 8.773e-
12 465-489


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT 8 
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45




180 


PR00007


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 7.429e- 

£M IbU-idU FR00007A 

19.33 4.938e-19 133- 

1.225e-15 206-228 
PR00007D 9 fi4 £ f>oe a 
11 238-249 


181 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323


182 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-306


183 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 
13.34 8.714e-17 162- 
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-196
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
5.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 7.911e- 
15 83-105 PR00450C 
12.22 C286e-13 47-69 


193 


PF00564


Octicosapeptide repeat
proteins.


PF00564B 24.74 6.164e-
16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e- 
15 204-224 PR00503B 
9.96 9.571e-13 170-187 


195 


BL00901


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e- 


197 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE


PR00690A 10.85 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.266e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-
15 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.188e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165


209 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE


PR00007A 19.33 5.731e-
19 131-158 PR00007B
14.16 4.115e-18 158-
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-51 


213 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.900e-
29 568-614 BL00039A
18.44 1.871e-23 21-60
BL00039C 15.63 1.720e-
11 364-388 BL00039B
19.19 4.064e-11 277-
303 


217 


BL00100


Chloramphenicol
acetyltransferase
proteins. 


BL00100D 17.22 8.484e- 
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN 


PR00213C 15.94 3.969e-
11 199-227 


222


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09
144-155


224


PR00875


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913 


225


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e-
19 18-39 


226


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B
15.11 8.200e-19 45-66


229


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361-
382


230


BL00460


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e-
20 35-70 BL00460B
9.73 7.429e-16 78-96
BL00460C 14.35 2.831e-
12 111-134 BL00460D
16.89 8.773e-11 140-
160


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e-
09 273-287


233


BL00292


Cyclins proteins.


BL00292B 20.31 7.429e-
27 244-275 BL00292A
22.87 7.750e-27 201-
235


234


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 6.308e- 
13 7-29 PR00449C 






WO 01/53312 PCT/US00/34263



17.27 4.462e-11 47-70
PR00449D 10.79 7.120e- 
11 109-123 


235 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 251-265 PR00019B
11.36 5.320e-09 119-
133 PR00019B 11.36
1.000e-08 229-243


236 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237


237 


PD00289 


PROTEIN SH3 DOMAIN
REPEAT PRESYNA.


PD00289 9.97 8.448e-09
67-81 


240 


PR00011 


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e- 
10 616-635 


241 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-
10 616-635


244 


BL00903 


Cytidine and
deoxycytidylate 
deaminases zinc-binding 
region s . 


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 8.043e- 
09 124-134 


248 


BL00246


Wnt-1 family proteins.


BL00246D 23.97 1.000e-
40 186-239 BL00246E 
20.32 1.000e-40 305- 
351 BL00246B 13.69 
4.176e-36 105-140 
BL00246A 15.75 2.286e- 
24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


250


PR00927


ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e-
10 253-275 


254 


BL00674 


AAA-protein family 
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


"PD0179el 15.01 b\045e- 
09 61-88 


"255 


BL50002 


Src homology 3 (SH3 ) 
domain proteins profile . 


BL50002B 15.18 2.800e- 
10 421-435 


~253 


PR00094


ADENYLATE KINASE 


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 
193 


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e- 
13 60-91 


262 


BL00388 


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C 
18.79 8.147e-16 126- 
148 


264


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein Kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270 


BL00226 


Intermediate filaments 
proteins . 


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196-
244 BL00226C 13.23
7.000e-20 261-292
BL00226A 12.77 5.143e-
15 96-111


271 


PD02952


KINASE TRANSFERASE
CHOLINE PROTEIN
MULTIGENE FAMI.


PD02952C 15.76 9.731e-
16 235-265 PD02952B
15.57 5.625e-09 215- 
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B
18.36 8.800e-17 179-
199 


274 


BL01027 


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052 


Ribosomal protein S7 
proteins. 


BL00052A 27.85 6.000e- 
13 137-184 BL00052B 
15.17 5.143e-12 208-
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e-
13 267-294 


280 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e-
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-
21 38-55 PR00319B
11.47 8.200e-19 57-72 


287 


PF00929 


Exonuclease . 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e-
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC- FINGER 
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e- 
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-135 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A
19.68 5.000e-25 102-
129 PF00953B 6.17
1.000e-13 182-194


304 




PF00152


tRNA synthetases class
II.


PF00152D 21.30 8.364e-
28 422-461 PF00152C
28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 8.250e-
35 37-76 


305 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE 1 


PR00454C 11.24 7.808e- 
09 1167-1186 


309 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPER FAMILY SIGNATURE 


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522


DNA polymerase family X
proteins.


BL00522C 11.90 7.577e-
24 315-339 BL00522F
14.90 1.310e-15 470-
494 BL00522A 25.52
1.265e-14 179-226
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267- 
313 


310


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 S.235e- 


312 


BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290A 20.89 4.706e- 
J***. lDl-174 BL00290B 
13.17 9.000e-12 211- 
229 


313 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A
13.96 9.217e-16 1-20 


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e- 
15 63-76 


317 


BL01020


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235




321 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 4.000e- 
12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D
16.54 1.918e-09 513-
564 BL00412D 16.54
2.102e-09 520-571


328 



BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.985e- 
18 370-418 BL00232B 
32.79 5.500e-16 258- 
306 BL00232B 32.79 
9.384e-15 475-523 
BL00232C 10.65 2.537e- 
12 256-274 BL00232C 
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e-
11 39-57 


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins. 


BL01016C 22.84 3.925e- 
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e-
11 127-140 BL01016G
7.14 5.622e-10 261-271
BL01016A 5.65 7.167e-
10 4-19 BL01016F
13.34 1.563e-09 200-
212 BL01016B 9.93
8.855e-09 38-50 


339 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.400e-
30 16-55 


343


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e-
11 135-154 


347


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351


BL01187


Calcium- binding EGF-like 
domain proteins pattern 
proteins . 


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276- 
292 BL01187B 12.04 
8.300e-ll 13-29 
BL01187B 12.04 7.429e- 
10 54-70 BL01187B
12.04 5.725e-09 231- 
247 BL01187A 9.98 
7.000e-09 255-267 


352 




PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168-
181 


354 


BL00380 


Rhodanese proteins. 


BL00380F 9.76 6.694e- 
11 542-553 


355 


PF00628


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233- 
246 PD00066 13.92 
4.300e-09 289-302 


361


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors. 


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184- 
239 


362


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors. 


PF00791B 28.49 2.273e-
11 279-334 


363 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 5.080e- 
10 73-95 PR00450C 
12.22 3.278e-09 109- 
131 


364 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins. 


PF00242Q 13.51 2.328e- 
09 22-68 




365


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins.


BL01160B 19.54 6.644e-
09 1038-1092


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PRonnup 
24.25 5.143e-09 6-35 


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e-
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e-
12 410-425


373


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.757e-
54 26-65


376


PR00170


SODIUM CHANNEL SIGNATURE


PR00170E £.48 2.739e-
10 88-118


380


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 _.692e-12 342- 
358 


381


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 5.714e-
12 50-66 


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.05 4.900e-
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 57-80 


386 


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 tf.OOOe- " 
10 97-130 


383 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e- 
13 516-529 


383 
1 390 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.657e- 
09 151-174 




BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 5.230e-
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-
09 272-285 BL00215B
10.44 8.500e-09 165-
178


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e- 
16 299-321 


397


PR00048


C2H2-TYPE ZINC FINGER 


PR00048A 10.52 8.579e- 
11 141-155 


398 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676 


Dehydrogenase El 
component . 


PF00676B 24.71 8.071e- 
18 331-369 PF00676D 
14.40 3.854e-15 486- 
506 PF00676C 16.88
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C- terminal 
domain proteins. 


BL00514C 17.41 4.673e-
28 4432-4469 BL00514G
15.98 6.092e-14 4555-
4585 BL00514D 15.35
2.532e-12 4473-4486
BL00514F 11.65 4.288e-
10 4519-4534 BL00514H
14.95 4.955e-10 4584-
4609


403 


PF00992 


Troponin. 


PF00992A 16.67 5.974e-
09 105-140 


404


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.450e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins. 


BL00232B 32.79 9.557e- 
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e- 
11 27-45 


407 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin) . 


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular- type proteins. 


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 3.571e-
31 245-291 BL00866C
23.26 9.000e-25 331-
366


418 


PR00239


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791


Domain present in ZO-1 
and Unc5-like netrin 
receptors. 


PF00791B 28.49 7.955e-
14 23-78 PF00791B
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e-
11 89-144 PF00791B
28.49 1.524e-10 56-111
PF00791C 20.98 3.559e-
09 37-76 PF00791C
20.98 5.235e-09 170-
209 PF00791C 20.98
5.235e-09 381-420
PF00791B 28.49 6.202e-
09 189-244 PF00791B
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e-
15 333-357 


432 


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433


PR00828


FORMIN SIGNATURE


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415


Synapsins proteins.


BL00415N 4.29 8.643e-
11 195-239 BL00415N
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE.


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),
P15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246-
281


449 


PR00568 


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PRC0568G 13.95 S.SSle- 
09 39-53 


451 


PF00084 


Sushi domain proteins 
(SCR repeat proteins.


PF00084B 9.45 3.813e-
10 47-59 


452 


BL00790


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 618-649 


456 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 1.000e-
25 77-99 PR00380D 
9.93 1.000e-21 281-303 
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABAA) RECEPTOR
SIGNATURE


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e-
21 452-473 


467 


PR00849 


FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 8.200e-12
33-44 


472 


BL00226


Intermediate filaments
proteins.


BL00226B 23.86 3.721e- 
09 282-330 


473 


BL00344 


GATA-type zinc finger
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481


Thiol-activated
cytolysins proteins. 


BL00481E 13.07 8.909e- 
09 173-199 


479 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.571e-
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-431 


482 


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 9.286e-
10 959-974 PR00049D
0.00 9.857e-10 958-973
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 8.615e- 
23 653-673 PR00007A 
19.33 6.192e-22 626- 
653 PR00007C 15.60 
o.oabe-lif ©9o-72Q 
PR00007D 9.64 3.647e- 

■* J / JZ" J 


487 


PD00567


PROTEIN RNA- BINDING RNA 
REPEAT HYD. 


PD00567B 18.23 2.853e- 
09 200-214 


488 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-
12 3-21 


489


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.882e-
27 30-69 PD01066
19.43 3.430e-10 71-110


490


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 7.864e-
09 663-678


492


BL01128


Shikimate kinase
proteins.


BL01128A 18.84 6.464e-
17 58-92


497


PF00429


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


Lipases, serine
proteins.


BI*0U120£ 11.37 7.923e- 
09 185-200


500 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159 


WW/rsp5/WWP domain
proteins . 


BJj01159 13. 8d 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins.


du\}\j\jz±b 13.33 3.73 9e- 
17 492-510 


508 


PR00120 


(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417 


MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534 


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e-
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.^25e- 
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1. 


PF00534B 14.47 5.625e-
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243
PD01841M 10.82 8.594e- 
21 1054-1073 PD01841I 
23.00 2.667e-13 549- 
591 


514 


PR00153 


CYCLOPHILIN PEPTIDYL- 
PROLYL CIS-TRANS


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.183e- 
12 410-423 


516 


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e-
12 1018-1052


517 


BL00242 


Integrins alpha chain 
proteins . 


BL00242C 16.36 8.320e- 
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118


525


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins . 


BL00319C 17.12 8.375e- 
10 61-95 


526 


PF00789


Domain present in
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e- 
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase / 
zeta-crystallin 
proteins . 


BL01162C 22.80 1.500e-
16 120-164


529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 3.893e-
09 60-73


532 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme
intermediate proteins.


BL00098C 21.65 2.800e-
38 181-227 BL00098B
32.59 5.345e-38 86-141
BL00098D 26.30 8.364e-
35 245-288 BL00098E
22.12 1.000e-34 314-
352 BL00098F 10.18
4.971e-22 365-386
BL00098A 10.60 6.455e-
11 38-50


535 


PR00370 


FLAVIN-CONTAINING
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20 


536 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.429e-
16 285-302 BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.000e-
10 48-65 BL00028
16.07 6.400e-10 193-
210 BL00028 16.07
1.000e-09 343-360
BL00028 16.07 6.914e-
09 73-95


545 


BL00250


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e-
31 293-329 BL00250B
27.37 5.286e-24 354- 
390 


547 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 186-201 PR00319A
15.27 7.344e-09 210-
227


548


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D
16.42 1.000e-40 177-
221 BL01204E 13.83
7.652e-33 225-250
BL01204C 13.93 3.714e-
22 141-160 BL01204B
15.41 4.333e-16 102-
116


549


PR00326 


PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e-
15 255-276


551 


PF00632


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e-
23 1569-1601 PF00632B
18.45 3.700e-21 1515-
1543


554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.600e-
14 187-205 BL00290A
20.89 2.059e-14 130-
153


557


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 6.339e-
09 846-879


559 


DM01111 


TRANSFORMING 61K PDF1 . 


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding 
protein, unique domain 
proteins. 


PF00658C 16.33 9.455e- 
32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins . 


BL00141A 12.10 4.150e- 
10 472-488 


566 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-
15 272-289


567 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e-
13 229-268


569 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e-
19 118-149 BL00107B
13.31 5.500e-15 183-
199


570 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e-
19 118-149 BL00107B
13.31 5.500e-15 183-
199


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 454-483 PR00193C
12.60 2.636e-31 223-
251 PR00193B 11.69
7.750e-29 171-197
PR00193A 15.41 2.588e-
22 115-135 PR00193E
19.47 6.559e-19 508-
537


573 


PR00193 


MYOSIN HEAVY CHAIN
SIGNATURE


PR00193D 14.36 1.857e-
34 470-499 PR00193C
12.60 2.636e-31 239-
267 PR00193B 11.69
7.750e-29 171-197
PR00193A 15.41 2.588e-
22 115-135 PR00193E
19.47 6.559e-19 524-
553


575 


BL00752


XPA protein. 


BL00752B 19.17 9.703e-
10 885-929


576 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952-
965


578 


BL00195 


Glutaredoxin proteins . 


BL00195B 15.31 7.155e-
09 121-141


579 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e-
11 217-231 PR00019B
11.36 1.350e-09 386-
400 PR00019A 11.19
3.333e-09 389-403
PR00019B 11.36 8.920e-
09 363-377


580


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 2.125e-
25 275-296 PR00253B
13.47 7.923e-24 301-
323 PR00253D 16.58
5.846e-23 444-465
PR00253C 13.85 2.241e-
20 335-357


583 


PR00343 


SELECTIN SUPERFAMILY
COMPLEMENT-BINDING
REPEAT SIGNATURE


PR00343C 16.85 2.236e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C
16.85 8.230e-10 1686-
1705


584


DM01537


kw SKI2W SKI 2 NUCLEOLAR
HELICASE.


DM01537B 21.63 1.878e-
37 79-126 DM01537B
21.63 9.491e-30 916-
963 DM01537A 15.14
3.196e-11 784-804


586 


PF00013


KH domain proteins
family of RNA binding
proteins.


PF00013 5.78 1.450e-09 
124-136 


587 


DM00892


3 RETROVIRAL PROTEINASE.


DM00892C 23.55 4.409e- 
13 262-296 


589


BL00478


LIM domain proteins.


BL00478B 14.79 1.643e-
13 261-276 BL00478B
14.79 7.709e-09 321-
336


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e-
15 931-948


591 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e- 
15 1062-1079 


593 


PF00628


PHD-finger.


PF00628 15.84 3.455e- 
12 424-439 


594


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.300e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


595 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598


PD01675


GLYCOPROTEIN MAJOR
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e-
10 55-39


600


BL00242


Integrins alpha chain
proteins.


BL00242K 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80
o.uuue-ii 01-/4
BL00242D 13.57 4.986e-


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e- 
09 198-217 


602 


PR00278 


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.5b'9e- 


603 


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479C 12.01 3.250e- " 
io iTn-iai 


604 


BL00315 


Dehydrins proteins.


uluujida 3.Jj l.o/2e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-
10 295-339


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358


608 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 265-282 


609 


PF00855


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787


615


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436-
455


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 5.143e-
12 531-551 DM01206B
10.69 2.603e-10 535-
555


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e-
21 561-582


622 


BL00239 


Receptor tyrosine kinase
class II proteins.


BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407 


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e-
09 326-339


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


627


PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e-
10 160-175


630 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e-
16 4-22


631 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 8.500e-
14 37-50


632


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B
10.69 4.822e-10 1276-
1296 DM01206B 10.69
7.658e-10 1328-1348
DM01206B 10.69 8.274e-
10 1280-1300 DM01206B
10.69 4.532e-09 1320-
1340 DM01206B 10.69
7.266e-09 1326-1346




BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins. 


BL00657A 19.39 1.545e-
30 101-143 BL00657B
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623


643


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09
199-212


647


PF00628


PHD-finger.


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648


BL01129 


Hypothetical
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236-
279 BL01129B 12.51
6.118e-13 191-212


649 


BL01228


Hypothetical cof family 
proteins. 


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain
proteins.


QuKi UUi ' ,1J o . bole- 

13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e-
12 1026-1045


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 4.000e-
24 253-274 PR00253C
13.85 8.800e-24 313-
335 PR00253B 13.47
3.143e-22 279-301
PR00253D 16.68 7.652e-
20 422-443


654 


PD01719 


PRECURSOR GLYCOPROTEIN
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 8.397e- 
09 580-595 


659


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e- 
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 5.950e- 
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 595-610


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610


666 


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE 


PR00819B 10.83 8.906e-
10 704-720


667 


BL50040 


Elongation factor 1
gamma chain profile.


BL50040C 22.62 2.143e-
16 135-178


668 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 139-153 PR00019A
11.19 1.667e-09 94-108
PR00019B 11.36 4.600e-
09 163-177


670 


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 3.250e-10
681-694 BL00018 7.41
6.400e-10 717-730


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C
19.59 1.346e-26 504- 
542 


673 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667G 15.33 7.557e-
10 106-123


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 4.857e-
13 593-608 PR00320B
12.19 4.115e-12 635-
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e-
10 635-650 PR00320C
13.01 6.400e-10 593-
608 PR00320B 12.19
3.250e-09 593-608


675 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 4.857e-
13 572-587 PR00320B
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.667e-
09 249-263


679


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187-
198


680


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e-
10 286-296


681 


BL00019 


Actinin-type actin-
binding domain proteins.


BL00019D 15.33 4.200e-
19 227-257


682


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e-
09 99-118


687 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e- 
10 538-553 


689 


BL01024


Protein phosphatase 2A
regulatory subunit PR55
proteins.


BL01024A 10.2$ 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain 
proteins . 


BL00027 26.43 8.071e- 
31 152-195 


692


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 45-57 


693


BL00211


ABC transporters family
proteins.


BL00211A 12.23 5.050e-
09 45-57


694 


BL00211


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e-
17 173-195


697 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 3.418e- 
11 242-265 


698


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267- 
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e-
10 77-91 PR00048A
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119
PR00048A 10.52 5.320e-
09 161-175


702 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e-
12 376-390 PR00048B
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e-
14 66-82


708 


PR00761 


BINDIN PRECURSOR
SIGNATURE 


PR00761E 14.32 8.500e-
10 822-841


712 


DM01354 


kw TRANSCRIPTASE REVERSE
II ORF2.


DM01354Y 10.69 4.977e-
38 425-465 DM01354X
13.86 7.300e-34 376-
415 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e-
10 356-376


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.545e-
27 450-496 BL00039A
18.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e-
13 194-220


715 


BL00383


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 2.688e-28 84-118 
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243 


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
D.Jv4e-ll 606-632 
BL00243I 31.77 1.380e- 
09 610-653 


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e-
09 20-36


722 


PR00704 


CALPAIN CYSTEINE
PROTEASE (C2) FAMILY
SIGNATURE


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e-
23 75-98 PR00704A
14.68 4.094e-19 30-54
PR00704C 11.88 1.871e-
18 99-116


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187


726


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e-
11 323-338 PR00320B
12.19 4.343e-10 323-
338 PR00320B 12.19
6.914e-10 277-292


731 


PR00195


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.565e-
28 26-65 BL00039D
21.67 2.105e-20 338-
384 BL00039C 15.63
9.100e-13 160-184
BL00039B 19.19 9.617e-
11 73-99


739


BL01289


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742


BL01019


ADP-ribosylation factors
family proteins. 


BL01019A 13.20 7.078e- 
12 41-81 


743


BL00965


Phosphomannose isomerase
type I proteins.


BL00965C 23.78 1.000e-
40 256-305 BL00965B
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5.345e-21 60-78


748 


BL00612


Osteonectin domain
proteins.


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157 


752 


BL00795 




BL00795C 17.06 6.000e-
11 384-429 BL00795C

1*7 nC O A A A o n lift 

415 


754 


BL00051 


proteins. 


16 4-50 


755 


DM01970 


0 kw ZK632 12 YDR313C 
ENDOSOMAL III.


unvLy/Uo o . du 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e- 
12 99-150 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88


763


PD02411 


PROTEIN TRANSCRIPTION
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770


BL00031


Nuclear hormones
receptors DNA-binding
region proteins.


BL00031A 19.55 9.571e-
32 208-241 BL00031B
.43 j.juuc"* / ^42- 
274 


772 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.450e-
18 4-26 PR00449E
13.50 3.520e-14 142-
165 PR00449C 17.27
3.032e-13 44-67
PR00449D 10.79 8.579e-
13 107-121 PR00449B
14.34 3.455e-11 27-44


773 


BL00523


Sulfatases proteins.


BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F
10.85 5.821e-10 373-
384


775 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 568-585 


776 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 621-638 


777


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 
14.39 7.000e-10 220- 
239 


779 


PR00079 


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE


PR00079B 12.98 2.929e-
26 193-222 PR00079E
16.65 4.150e-23 348-
375 PR00079C 8.68
6.351e-16 246-264
PR00079D 13.51 7.070e-
16 264-281 PR00079A
16.12 6.769e-13 169-
183




BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221-
246 BL00215A 15.82
/.as>/e-i2 1D8-133 
BL00215B 10.44 9.526e- 
11 168-181 


783


PD00289


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690 


DEAH-box subfamily ATP-
dependent helicases
proteins.


BL00690B 13.38 1.000e-
12 147-165 BL00690A
o.o/ d . jzue-10 114-124 
BL00690C 7.51 3.189e- 


786 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449C 17.27 8.500e- 
16 50-73 PR00449A 
13.20 5.235e-14 8-30 
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111- 
125


788


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 8.767e-
10 1-21


790


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e-
39 725-764 BL00915B
22.78 5.050e-33 633-
671 BL00915D 27.02
1.529e-21 795-831
BL00915A 10.09 1.000e-
13 395-407


791 


PR00208


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 6.294e-
10 120-138 PR00208A
12.59 6.294e-10 121- 
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124-
142 PR00208A 12.59
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127-
145 PR00208A 12.59
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
PR00208A 12.59 7.904e- 
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE


PR00205B 11.39 5.034e-
16 302-320 PR00205A
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D
16.54 1.918e-09 194-
245 BL00412D 16.54
2.102e-09 201-252


797 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 6.339e- 
13 40-58 


799 


BL01052 


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-
40 87-127 BL01052A
16.12 1.529e-32 3-35
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174-
194


800 


BL00348


p53 tumor antigen 
proteins. 


BL00348F 23.19 3.714e- 
09 197-240 


801 


BL00309 


Vertebrate galactoside-
binding lectin proteins.


BL00309C 18.65 1.621e-
09 62-87


802 


PR00245


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine
sensitive L-type calcium
channel (Beta subuni.


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811 


BL00685


CBF-A/NF-YB subunit
proteins.


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080


ALCOHOL DEHYDROGENASE
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357


Histone H2B proteins. 


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family 
proteins. 


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN.


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.478e-09 132-142


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e-
21 44-62 PR00405C
19.41 1.000e-13 65-87
PR00405A 17.71 7.283e-
13 25-45


831 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136- 
150 PR00019B 11.36 
3.880e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231


834


PD00306


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 3.898e- 
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 2$\46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e-
22 369-390 PR00700D
12.47 5.765e-21 491-
510 PR00700C 13.17
4.750e-14 449-467
PR00700F 11.18 8.500e-
11 538-549 PR00700E
17.57 3.100e-10 522-
538




PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e-
13 134-153


844 


PD02785


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING KEP. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A
15.23 1.915e-28 8-57 




BL00826


MARCKS family proteins.


BL00826C 7.63 6.738e-
09 203-230


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e- 
09 12-27 


851 


PD02411


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR . 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat
proteins domain
proteins.


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 5.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90
7.955e-10 567-578


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e- 
09 64-83 


859


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 2.929e-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e-
10 128-147


861


PR00988


URIDINE KINASE SIGNATURE


PR00988A 6.39 4.250e-
17 23-41 PR00988C
13.64 8.714e-16 107-
123 PR00988F 12.23
7.828e-15 198-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72


863 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e-
12 41-54


864 


PR00775 


90 KD HEAT SHOCK PROTEIN
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B
3.52 1.837e-23 107-130
PR00775D 8.91 4.484e-
17 171-189 PR00775A
9.90 8.342e-17 86-107
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-2S7 ' 


866 


DM01688


2 POLY-IG RECEPTOR. 


DM01688G 16.45 9.460e- 
09 89-121 


867


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 5.596e- 
29 14-53 


868 


BL01287


RNA 3'-terminal
phosphate cyclase
proteins.


BL01287A 17.95 2.688e-
26 16-48


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85


874 


BL00188 


Biotin-requiring enzymes
attachment site
proteins.


BL00188 30.29 9.036e-
32 665-711


876 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 298-315


877 


PD02102


SUBUNIT E V-ATPASE

vhvuulihk Air oIN InAofi 

HYDROL . 


PD02102A 16.74 4.17^e-
10 97-141


879


BL01189 


Ribosomal protein S12e
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-125


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896


PR00391


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN
SIGNATURE


PR00391E 12.50 7.785e-
15 211-231 PR00391B
8.39 1.000e-13 83-104
PR00391D 12.21 9.328e-
13 191-207 PR00391A
7.83 5.390e-11 16-36


897 


PR00327 


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A
18.44 6.674e-16 113-
152 BL00039B 19.19
1.947e-13 153-179
BL00039C 15.63 9.460e-
11 236-260


901


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
8.200e-14 338-351


902 


BL01115


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e- 
09 97-111 


904 


PR00381


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e-
25 335-356 PR00381B
18.17 2.667e-24 204-
224 PR00381A 9.55
2.800e-24 107-125
PR00381C 12.48 4.522e-
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13
3.288e-22 370-392
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e- 
11 293-314 PR00381E 
8.75 8.364e-10 377-398 
PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345 


STATHMIN FAMILY
SIGNATURE


PR00345C 4.54 8.557e-
09 525-549


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE


PR00345C 4.54 8.557e-
09 513-537


908 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155


910 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


proteins. 


BL01104C 15.14 6.000e-
09 364-392




BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.842e-09
500-511


923 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.500e-
09 323-338 PR00320C
13.01 5.500e-09 187-
202


924 


PD02181 


PROTOCHLOROPHYLLIDE 
REDUCTASE PHOTOSYNT . 


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019 


Actinin-type actin-
binding domain proteins.


BL00019C 14.64 7.453e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-294 BL00678 9.67
1.600e-10 314-325
BL00678 9.67 7.600e-10
360-371 BL00678 9.67
8.579e-09 206-217


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e-
10 137-146


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 134-165 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e-
20 172-202 BL01085C
21.81 2.038e-14 66-97




BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 152-183 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e-
20 190-220 BL01085C
21.81 2.038e-14 66-97


933 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301A 10.24 6.400e- 
09 160-171 


936


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e- 
12 336-362 


937


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e- 
10 5-49 


940


PR00862 


PROLYL OLIGOPEPTIDASE 
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16.17 4.086e- 
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e- 
09 407-420 


948


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 7.429e-
18 52-68 BL00479A
19.86 2.200e-13 26-49


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE 
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e-
10 66-111


955 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e- 
12 47-60 


956 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e- 
12 47-60 


J73 /


BL00379


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.^4 1.^10e-
15 111-148


959


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e- 
14 110-154 


962


BL00061


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 




PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58


969 


PF01008 


Initiation factor 2
subunit.


PF01008B 25.59 4.724e-
31 417-460 PF01008C
12.25 5.333e-18 506-
526 PF01008A 20.14
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH
proteins.


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159 


WW/rsp5/WWP domain
proteins.


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171- 
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e-
09 55-94


978 


BL01167 


Ribosomal protein L17 
proteins . 


BL01167B 20.66 6.258e- 
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE


PR00312E 8.32 3.423e-
36 169-199 PR00312I
15.78 5.286e-35 332-
361 PR00312F 15.06
5.865e-35 199-229
PR00312H 13.31 8.313e-
35 263-291 PR00312J
13.73 5.688e-34 363-
392 PR00312D 9.43
2.636e-33 126-158
PR00312C 15.14 8.839e-
33 92-122 PR00312B
15.08 8.941e-33 62-92
PR00312G 11.11 6.657e-
32 230-258 PR00312A
11.70 6.914e-27 35-59


981 


PF00992


Troponin. 


PF00992A 16.67 8.816e- 
09 414-449 


982 


PR00299 


ALPHA CRYSTALLIN 
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd
subunit proteins.


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100-
138


986 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.211e-
14 4-49 BL00795C
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e-
10 14-59 BL00795C
17.06 7.802e-10 2-47
BL00795C 17.06 8.640e-
10 19-64 BL00795C
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e-
09 3-48


987 


BL00939


Ribosomal protein L1e
proteins.


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 497-513 


994 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase 
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN . 


DM01767B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e-
24 73-94 PR00926D
10.53 3.250e-23 126-
145 PR00926F 17.75
6.211e-23 217-240
PR00926E 11.70 6.625e-
20 174-193 PR00926B
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
15 11-25 PR00926F
17.75 5.565e-09 120-
143


1005 


BL00406


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406D 12.58 3.700e-
40 270-325 BL00406E
8.44 7.375e-38 327-377
BL00406A 9.95 3.348e-
29 11-46


1006


BL00406


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A
9.95 3.348e-29 11-46


1007 


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 6.714e-
22 384-407 PR00304C
8.69 4.667e-20 98-118
PR00304B 11.60 7.577e-
19 68-87 PR00304A
9.20 3.382e-16 46-63
PR00304E 7.79 6.870e-
13 418-431


1009 


PD01066 


PROTEIN ZINC FINGER
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168H 12.08 1.000e-
11 174-194


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase
family phosphohistidine
proteins.


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305 


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353


HMG1/2 proteins.


BL00353B 11.47 2.436e-
18 238-288 BL00353C
14.83 8.844e-11 288-
335


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.310e-
33 43-91


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY
SIGNATURE


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.657e- 
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.61 4.259e-
11 55-82


1039


BL00299 


Ubiquitin domain 
proteins. 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 6.143e-
20 56-78 PR00970D
9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e-
13 114-128 PR00048A
10.52 1.000e-09 172-
186


1045 


BL00615 


C-type lectin domain
proteins.


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40 


1047 


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins. 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.C10e- 
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins. 


BL01073 24.30 1.000e-
40 12-62


1054 


BL00571


Amidases proteins.


BL00571 25.69 5.875e-
31 160-212


1055 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins . 


BL00030A 14.39 5.235e- 
11 98-117 BL00030B 
7.03 4.316e-09 137-147 


1058


BL00223


Annexins repeat proteins
domain proteins.


BL00223C 24.79 8.754e-
23 262-317 BL00223A
15.59 9.478e-14 46-80
BL00223A 15.59 5.557e-
11 118-152


1060 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455 


Putative AMP-binding
domain proteins.


BL00455 13.31 6.211e-
13 280-296


1065 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.000e-
09 115-129 PR00019B
11.36 3.880e-09 87-101


1066


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 4.600e-
16 151-172 PR00326C
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e-
14 172-191 PR00326D
19.09 1.257e-13 217-
236


1071 


PD02870


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 8.518e-
11 164-197


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e- 
09 350-387 


1075 


BL01009 


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10)
FAMILY SIGNATURE


PR00724A 10.91 1.000e-
08 366-379


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.316e-09
298-309


1081


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 7.39Ee- 
10 23-57 


1094 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e-
18 57-92 BL00460B
9.73 6.400e-13 100-118
BL00460D 16.89 9.143e-
12 162-182 BL00460C
14.35 5.500e-09 133-
156


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e-
22 67-105 PD02811B
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e-
22 60-98 PD02811B
17.07 2.263e-21 111-
144 PD02811C 13.25
5.696e-13 147-160


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881


Nitroreductase family.


PF00881A 27.15 9.229e-
13 111-147


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e-
20 42-60 PR00405A
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 63-85 


1116 


BL00355


HMG14 and HMG17 
proteins. 


BL00355 5.97 2.528e-25 
20-51 


1117 


BL00355


HMG14 and HMG17
proteins.


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e-
10 290-306


1123 


PR00412 


EPOXIDE HYDROLASE
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


1125 


PR00186


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e- 
09 87-101 


1129 


BL00170


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e- 
15 10-37 


1131 


BL00636


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor 
complexes medium chain 
proteins. 


BL00990C 18.78 4.176e-
38 235-269 BL00990A
21.44 4.316e-36 94-132
BL00990B 20.15 2.125e-
27 157-187 BL00990D
16.13 5.320e-18 403-
422


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e-
34 100-128 PR00314D
9.66 3.531e-33 233-261
PR00314C 16.05 8.909e-
32 159-188 PR00314A
14.53 1.281e-22 13-34


1139 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 4.00Ge- 
19 451-492 BL00107B 
13.31 3.377e-12 519- 
535 


1148 


PR00685 


TRANSCRIPTION INITIATION
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB . 


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623


GMC oxidoreductases
proteins.


BL00623E 15.00 3.531e-
20 391-414 BL00623C
10.86 4.240e-20 155-
176


1161 


PD01937 


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-
09 330-341


1162 


PD01937 


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-
09 221-232


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e- 
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments
proteins.


BL00226B 23.86 7.384e-
09 302-350


1177 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032G 8.33 1.422e-
10 34-48


1178 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 1.794e-
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e-
09 35-50 PR00320B
12.19 9.100e-09 79-94


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e-
19 765-784


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720 


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 2.761e-
10 77-93


1188 


BL00878 


Orn/DAP/Arg
decarboxylases family 2
pyridoxal-P attachment
site.


BL00878B 10.95 6.000e-
16 189-204 BL00878C
17.74 8.435e-15 225-
245 BL00878F 19.67
3.625e-13 379-402
BL00878D 16.56 1.621e-
09 270-289


1191 


PD02939 


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e-
12 203-220 PD02939C
20.01 1.000e-11 224-
252


1193 


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 72-101 PR00345E
8.54 7.652e-28 149-174
PR00345C 4.54 9.100e-
28 101-125 PR00345D
10.97 1.964e-24 125-
149 PR00345A 13.46
5.645e-16 43-62


1194 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 108-137 PR00345E
8.54 7.652e-28 185-210
PR00345C 4.54 9.100e-
28 137-161 PR00345D
10.97 1.964e-24 161-
185 PR00345A 13.46
5.645e-16 79-98


1195 


PF00995


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e- 
11 15-47 


1197 


BL01298 


Dihydrodipicolinate
reductase proteins.


BL01298A 13.90 5.959e-
09 51-73


1203 


BL00061


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190


1204


PR00118


BETA-LACTAMASE CLASS A
SIGNATURE


PR00118F 16.42 9.386e-
09 213-229


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27 71 8 ^l^p-^l Tea 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins. 


BL00979L 20.63 2.485e-
09 105-146


1209 


PF00023


Ank repeat proteins.


PF00023A 16.03 4.857e-
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 7.750e-
14 227-241 PR00048A
10.52 4.316e-11 199-
213


1213 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 1.720e-
10 20-42 PR00450C
12.22 3.506e-09 56-78
PR00450D 16.58 6.769e-
09 44-64


1216 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 5.348e- 
11 249-264 


1222 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066
13.92 7.231e-15 406-
419 PD00066 13.92
2.286e-12 378-391
PD00066 13.92 7.857e-
12 434-447 PD00066
13.92 3.348e-11 350-
363


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61


1226


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86
1.000e-40 190-239
BL00437D 25.72 1.000e-
40 248-301 BL00437E
23.95 1.000e-40 327-
379


1230 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 6-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e-
09 391-405


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e-
10 158-176


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.104e-
11 10-25


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e-
11 8-52


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE
SIGNATURE


PR00011B 13.08 3.217e-
10 174-193


1259 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070


DIHYDROFOLATE REDUCTASE
SIGNATURE


PR00070D 11.63 1.000e-
15 112-127 PR00070C
13.09 9.500e-15 51-63 
PR00070A 12.92 5.500e- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e-
24 140-183 BL00462B
17.88 5.500e-20 230-
267 BL00462C 27.41
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.455e- 
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e-
18 165-182 PR00837A
14.77 4.512e-12 86-105
PR00837D 11.12 7.577e-
12 201-215


1269


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449C 17.27 9.308e-
22 40-63 PR00449E
13.50 1.000e-16 137-
160 PR00449D 10.79
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins 
proteins. 


BL00276A 8.87 1.500e-
09 17-29


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e- 
09 220-243 


1276 


PR00412 


EPOXIDE HYDROLASE
SIGNATURE


PR00412B 12.59 7.894e-
12 119-135 PR00412C
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280 


BL01220 


Phosphatidylethanolamine
-binding protein family
proteins.


BL01220C 14.75 9.348e- 
15 248-276 


1285 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 2.286e-
10 33-42


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16.51 1.610e- 
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283 


1301 


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e-
28 82-126 BL00127B
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e-
09 290-306


1307 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 5.500e-
17 13-38 BL00215A
15.82 1.000e-16 226-
251 BL00215A 15.82
2.658e-13 107-132


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.682e- 
09 552-572 


1309 


PD00301


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain
proteins.


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins. 


BL00194 12.16 1.900e- 
11 15-28 


1314 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13 
proteins. 


BL00783C 22.43 6.559e-
24 87-117 BL00783A
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514 


Armadillo/beta-catenin-
like repeat proteins.


PF00514A 31.30 7.268e-
11 82-120


1329


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 7.239e-
09 25-43


1332 


PR00161


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.769e-
33 10-49


133^ 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e-
09 262-281


1337 


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 211-230


1340


PR00860


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


13 5-18


1341 


BL00893


mutT domain proteins.


BL00893 18.99 6.750e- 
16 46-71 


1343


BL01282


BIR repeat proteins.


BL01282B 30.49 5.974e-
21 383-422


1344


DM00099


4 Xw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 8.313e-
09 417-427


1345 


BL00923 


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 5.935e-
10 135-146


1348 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 7.231e-
13 44-57


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e-
32 416-445 PR00193C
12.60 6.318e-31 179-
207 PR00193B 11.69
3.571e-24 133-159
PR00193E 19.47 9.069e-
22 470-499 PR00193A
15.41 1.783e-20 77-97


1352 


PR00447 


ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e-
15 299-319 PR00447D
13.54 3.408e-15 200-
224 PR00447A 12.73
6.357e-11 97-124
PR00447G 6.69 5.877e-
10 353-373


1353


BL00303


S-100/ICaBP type calcium
binding protein.


BL00303A 21.77 6.667e-
26 45-82 BL00303B
26.15 1.000e-24 93-130


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.950e-
29 375-421 BL00039A
18.44 7.136e-29 99-138
BL00039C 15.63 4.000e-
18 225-249 BL00039B
19.19 3.182e-14 141-
167


1357 


PF00615


Regulator of G protein
signalling domain
proteins.


PF00615B 16.25 2.216e-
12 84-101 PF00615C
10.06 8.412e-12 162- 
176 


1360


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e-
29 10-49


1361 


PR00925 


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE


PR00925A 5.47 5.091e-
18 14-29 PR00925B
3.73 6.143e-14 29-42
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272 


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e-
30 136-171 BL01272C
11.68 3.314e-25 249-
274 BL01272A 6.49
1.231e-18 99-117


1363 


BL01272 


Glucokinase regulatory
protein family proteins. 


BL01272B 19.61 6.870e- 
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49
1.231e-18 76-94 


1364 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 5.304e- 
09 167-177 


1368


PR00169


POTASSIUM CHANNEL
SIGNATURE


PR00169A 16.77 1.592e-
39 76-96


1370


PR00988


URIDINE KINASE SIGNATURE


PR00988A 6.39 1.794e-
10 1-19


1371 


BL00242


Integrins alpha chain
proteins.


BL00242B 8.13 8.615e- 
09 469-479 


1372 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e-
19 46-67 PR00625A
12.84 1.391e-16 14-34


1373 


BL00434 


HSF-type DNA-binding 
domain proteins . 


BL00434C 23.85 3.770e- 
09 90-130 


1374 


PR00952


LETHAL (2) GIANT LARVAE
PROTEIN SIGNATURE


PR00952C 8.00 6.337e- 
09 505-526 


1375 


PD02475 


MUCIN EPITHELIAL TUMOR- 
ASSOCIATE . 


PD02475A 23.18 8.552e- 
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.571e- 
32 24-63 


1380 


BL00194


Thioredoxin family 
proteins. 


BL00194 12.16 8.333e- 
12 48-61 


1381


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 1.458e-
15 1123-1136


1383


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
271-282 


1385 
-TTflfi 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 t>.203e- " 
10 95-132 


XJOO 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 5.042e- 
09 1574-1628 


T -yon ~ 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
11 52-61


1389


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL-
BINDING NU.


PD01066 19.43 3.600e-
30 10-49


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL - 
BINDING NU. 


PD01066 19.43 3.512e- 
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e- 
25 88-110 PR00380D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e-
16 208-226 PR00380C
13.18 6.538e-16 243-
262


1394 


PD00066 


PROTEIN ZINC- FINGER 
METAL- BINDI . 


PD00066 13.92 3.400e-
14 462-475 PD00066
13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-
333


1398 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.786e-
32 10-49


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930A 25.62 7.324e-
15 363-389


1407 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e- 
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.550e-
11 179-193 PR00019A
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-
09 176-190


1409


PR00510


NEBULIN SIGNATURE


PR00510A 9.09 4.150e- 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078 


REPEAT PROTEIN ANK
NUCLEAR ANKYR . 


PD00078B 13.14 5.696e- 
09 31-44 


1412 




BL00358


Ribosomal protein L5
proteins.


BL00358B 22.76 1.000e-
40 57-103 BL00358C
13.75 6.087e-14 122-
136 BL00358D 14.26
5.500e-13 143-158
BL00358A 13.06 1.931e-
11 33-44


1414 


BL00282


Kazal serine protease 
inhibitors family 
proteins . 


BL00282 1^,88 7.338e- 

10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681 


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e- 
09 38-60 


1418 


DM00973


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 1.571e-
09 428-443


1420 


PD01941 


TRANSMEMBRANE
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400- 
447 PD01941E 15.92 
2.475e-20 817-864 
PD01941C 19.96 3.118e- 
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52 
5.382e-15 1038-1093 


1422 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN 
FAMILY SIGNATURE 


PR00209B 4.88 6.318e- 
11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e-
14 367-386 BL50002A
14.19 9.250e-12 298-
317 BL50002A 14.19
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628


PHD-finger.


PF00628 15.84 3.045e-
12 377-392


1427 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e- 
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


34 147-193 


1429 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430


PR00378


INOSITOL PHOSPHATASE
SIGNATURE


PR00378D 16.86 7.563e-
12 295-314 PR00378B
13.80 8.650e-10 166-
186


1431


PR00928


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-
10 103-124


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 




PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 7.983e-
10 135-150


1436 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103


1438


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e- 
09 250-268 BL00290A 
20.89 4.000e-09 188- 
211 


1440 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins.


BL00422D 19.48 1.000e-
08 114-138


1445 


PD01841


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e-
20 1175-1194


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e- 
09 402-416 




DM00315 


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR . 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase 
proteins. 


BL00545C 11.26 7.353e- 
17 169-182 BL00545A 
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e-
09 233-245


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686 


Starch binding domain 
proteins. 


PF00686A 13.45 9.100e- 
09 267-277


1477 


PF00566


Probable rabGAP domain
proteins.


PF00566A 12.64 7.333e-
10 466-476


1478 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479 


DM00406


GLIADIN.


DM00406 7.73 8.541e-10 
292-305 


1480 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.385e- 
15 69-87 BL00290A
20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.03Se- 
09 21-S1 


1482 


PF00780 


Domain found in NIK1- 
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 1.153e- 
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.909e-
25 17-56


1486 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.529e-
09 34-50


1488


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 9.586e- 
10 116-162 


1490


BL00166


Enoyl-CoA
hydratase/isomerase
proteins.


BL00166D 22.87 2.507e-
24 190-226 BL00166C
18.93 5.500e-14 140-
167 BL00166B 15.92
9.357e-11 93-115


1491 


BL00452


Guanylate cyclases
proteins.


BL00452D 28.59 3.700e-
31 63-106 BL00452E
11.92 3.045e-13 115-
131


1492 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 4.789e-
24 112-155


1503 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins. 


BL01177E 20.64 5.800e-
24 448-475 BL01177C
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 5.500e- 
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e-
10 341-363


1512


BL00523 


Sulfatases proteins. 


BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e-
19 98-122 BL00600E
16.43 1.771e-17 302-
331 BL00600G 12.43
9.625e-17 377-396
BL00600B 15.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e-
11 192-207 PR00320B
12.19 8.839e-11 272-
287 PR00320B 12.19
9.743e-10 106-121
PR00320A 16.74 1.878e-
09 192-207 PR00320A
16.74 2.317e-09 106-
121 PR00320A 16.74
8.683e-09 272-287
PR00320C 13.01 8.800e-
09 106-121


1538


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 4.508e-
15 171-184


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed) . 


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 
PR00965C 15.04 1.000e-
27 131-151 PR00965D
5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-
27 258-279 PR00965B
4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-
25 35-55 PR00965I
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e-
17 163-207


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699C 24.84 1.000e-
40 599-646 PD02699A
8.91 2.286e-34 219-248
PD02699B 18.28 6.143e-
21 485-509


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e- 
10 102-197 PR00049D 
0.00 7.102e-09 67-82 


1547


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-
40 93-142 BL00951D
13.94 8.714e-40 142-
177 BL00951A 15.10
1.000e-38 2-38
BL00951B 14.23 6.250e-
33 38-69


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e-
30 279-318 BL00536D
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e- 
09 58-73


1556 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e-
12 107-132


1558 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e-
12 107-132


1559 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e-
12 107-132


1562 


BL00522 


DNA polymerase family x 
proteins . 


BL00522C 11.90 6.600e- 
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e-
14 502-532 BL00522F
14.90 2.385e-13 551- 
575 


1563 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.947e-
11 46-59


1564 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 8.594e-
17 184-228 BL01013C
9.97 4.906e-12 14-24


1567 


BL00678


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 3.400e-10 
378-389 BL00678 9.67 
5.800e-10 418-429 
BL00678 9.67 8.800e-10 
295-306


1570


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A
19.86 6.625e-15 271-
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665


OXYTOCIN RECEPTOR
SIGNATURE


PR00665G 12.36 4.673e-
24 364-384 PR00665D
9.93 1.200e-22 138-155
PR00665F 11.73 4.000e-
22 337-354 PR00665C
5.89 1.000e-20 65-80
PR00665B 5.29 4.337e-
19 24-39 PR00665E
5.60 2.929e-15 246-260
PR00665A 5.99 5.622e-
15 11-25


1577 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins. 


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411 


Kinesin motor domain 
proteins . 


BL00411C 15.04 5.292e-
12 32-54 BL00411H
15.66 4.441e-11 245-
276


1582 


PR00604


CLASS IA AND IB
CYTOCHROME C SIGNATURE


PR00604A 11.13 7.440e-
09 79-87


1584 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.000e-
10 225-238


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


HM01354S 11.61 7.750e- 
09 474-495


1587 


PR00072 


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e-
33 180-210 PR00072A
12.75 6.040e-25 120-
145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 
10.54 1.360e-19 301- 
318 PR00072G 10.45 
5.304e-19 433-450 
PR00072F 8.87 5.935e- 
15 332-349 


1589 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins . 


BL00191H 15.64 1.537e- 
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME . 


DM00517B 10.96 6.625e-
16 1175-1193 DM00517A
8.21 1.000e-11 1015-
1026


1592 


BL00037 


Myb DNA-binding domain 
proteins repeat proteins 
proteins. 


BL00037B 15.92 3.250e- 
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e- 
12 31-55 BL00037B 
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164 


1595 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.514e-
09 110-127


1598 


PF00628


PHD-finger.


PF00628 15.84 3.250e-
11 1667-1682


1599 


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e-
09 980-995


1600


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 6.571e- 
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.571e-
10 44-57


1607 


BL00252 


Interferon alpha, beta 
and delta family 
proteins. 


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109 


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
08 61-94


1611 


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e- 
09 365-391 


1613


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412D 16.54 6.051e- 
09 932-983 BL00412D 
16.54 7.153e-09 933- 
984 


1614 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins . 


BL00559I 13.63 3.531e- 
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63
6.870e-16 124-176
BL00559L 13.60 9.000e-
16 266-284


1615 


PD01427


TRANSFERASE
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e-
22 500-541 PD01427A
19.94 8.773e-18 439-
472


1616 


BL00115


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e- 
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617 


BL00303 


S-100/ICaBP type calcium
binding protein.


BL00303B 26.15 7.750e-
32 51-88 BL00303A
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI. 


PD01888B 25.10 1.000e-
40 47-97 PD01888C
21.56 7.000e-30 125-
155 PD01888A 12.84
8.800e-15 7-23


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e-
09 702-714 PR00239E
1.58 5.193e-09 703-715 


1622 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins . 


BL00325B 21.66 1.000e-
40 93-139 BL00325A
24.83 6.786e-23 61-93 


1631 


BL00064 


L-lactate dehydrogenase
proteins. 


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137-
182 BL00064E 27.20
1.000e-40 223-275
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063 


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e-
11 59-84 PR00063A
11.71 1.614e-09 34-59 


1634 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e-
11 36-49 PR00239C
3.51 2.538e-09 37-45 


1636 


BL01210


Caveolins proteins.


BL01210B 13.92 9.531e-
10 133-183


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR
SIGNATURE 


PR00015B 9.94 8.468e- 
10 128-149 


1641


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 5.935e-
11 364-379 PR00320A
16.74 7.828e-11 364-
379 PR00320C 13.01
2.800e-10 279-294
PR00320C 13.01 2.800e-
10 364-379 PR00320B
12.19 5.114e-10 279-
294 PR00320A 16.74
1.659e-09 279-294
PR00320A 16.74 2.098e-
09 229-244


1642


PF00023


Ank repeat proteins.


PF00023A 16.03 6.464e-
09 114-130


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 l.BOfie- 
11 74-94 


1644 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24
proteins . 


BL01108A 20.33 7.366e- 
17 56-89 


1646 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D 
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e- 
16 332-351 PR00380B 
12.64 6.657e-15 292- 
310 


1647 


DM01242


3 THREONINE--TRNA
LIGASE.


DM01242C 17.15 9.791e-
37 340-381 DM01242E
23.00 5.071e-31 463-
505 DM01242D 23.29
3.925e-30 420-463
DM01242B 23.57 8.054e- 
18 265-314 DM01242F 
10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA . 


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 6.720e- 
11 431-485 


1652 


BL00933 


FGGY family of 
carbohydrate kinases 
proteins.


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 


BL00795


Involucrin proteins.


BL00795C 17.06 2.988e-
10 70-115


1654


BL00982 


Bacterial-type phytoene
dehydrogenase proteins . 


BL00982A 18.41 7.750e- 
17 302-334 


1655 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 282-314


1656 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 1.391e- 
16 607-630 


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e-
11 114-136


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.889e-
10 442-455


1659 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20.72 5.629e-09 446-
468


1660 


BL00406 


Actins proteins.


BL00406D 12.58 8.767e- 
15 188-243 


1661


PR00105


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE


PR00105A 10.36 4.900e-
13 1140-1157 PR00105B
12.32 2.800e-12 1259-
1274 PR00105C 10.86
1.000e-10 1305-1319


1662 


BL00280 


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24. ell 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 5.050e-10
489-502


1667 


PD01066


PROTEIN ZINC FINGER
ZINC- FINGER METAL-
BINDING NU.


PD01066 19.43 8.500e-
38 7-46


1669 


BL01153 


NOL1/NOP2/sun family
proteins . 


BL01153D 19.69 1.188e- 
17 115-141 BL01153C 
13.67 8.977e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e- 
10 1146-1169 


1672 


BL00598


Chromo domain proteins.


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0.00 1.286e-10 342-357


1676 


PR00747 


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-
19 427-448 PR00747G
14.50 2.286e-18 368-
393 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747D
15.23 8.759e-17 163-
183 PR00747E 15.13
8.244e-15 254-272
PR00747B 7.65 5.355e-
13 75-90 PR00747F
13.56 8.714e-10 311-
328


1677 


PR00747 


GLYCOSYL HYDROLASE
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-
19 309-330 PR00747G
14.50 2.286e-18 250-
275 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747B
7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-
10 193-210


1680


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67
6.684e-09 320-331 


1681 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PRO 064 61 


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PROO&^H 6". 32 4.188e- 
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e- 
10 420-435 


1692 


PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE


PR00456E 3.06 7.281e-
10 487-502 PR00456E
3.06 7.281e-10 488-503
PR00456E 3.06 8.125e-
10 489-504


1693 


BL00674 


AAA-protein family 
proteins . 


BL00674C 22.60 8.043e- 
24 274-317 BL00674B
4.46 4.000e-23 241-2^3 
BL00674D 23.41 8.560e- 
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e- 
09 498-517 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e-
12 283-300 BL00028
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e-
11 227-244 BL00028
16.07 1.600e-10 199-
215


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.5^5e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36
7.120e-09 204-218 


1711 


BL01159


WW/rsp5/WWP domain
proteins.


BL01159 13.85 6.523e-
11 232-247 BL01159
13.85 5.408e-10 613-
628


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e-
10 187-203


1713 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar) . 


PF00642 11.59 9.550e-
11 230-241


1715 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 6.448e-
12 79-100 BL00038A
13.61 4.000e-11 52-68


1723 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-428


1724


BL01279


Protein-L-
isoaspartate(D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e-
12 233-281


1728 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61


1731 


BL01160 


Kinesin light chain
repeat proteins . 


BL01160B 19.54 9.6i4g- 
10 296-350 


1732 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family . 


PFfiOft*iflP lC 7(1 A "1/1 Qo^ 

rruuosur X-J . 'U *4.j*t!?e- 
22 246-279 PF00850D 
14 76 6 B^OpOO 177- 
201 PF00850E 8.88 
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C S 61 5 Q"*?*»- 
09 292-307 


1735 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.263e- 
10 492-502 


1743 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE


PR00449A 13.20 1.188e-
11 5-27 PR00449D
10.79 2.241e-10 109-
123 PR00449E 13.50
9.289e-10 144-167


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D
10.79 2.241e-10 109-
123 PR00449E 13.50 
9.289e-10 144-167 


1745 


BL00720 


Guanine-nucleotide
dissociation stimulators
CDC25 family sign.


BL00720B 16.57 8.297e-
15 136-160


1746 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439 


Acyltransferases
ChoActase / COT / CPT 


BL00439H 18.24 8.435e- 
14 65-91 BL00439G 
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE


PR00819B 10.83 7.158e-
11 4-20


1751 


PD00066


PROTEIN ZINC-FINGER 
METAL- BINDI . 


PD00066 13.92 3.400e- 
14 33-46 PD00066 
13.92 1.000e-13 89-102 
PD00066 13.92 7.000e- 

A J OX- IH fUUUUOO 

13.92 6.571e-12 117-
130


1753 


BL01013 


Oxysterol-binding
protein family proteins.


18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


DJJUU / 4&U.UA 6 . J7JC 

09 490-521 BL00790I 
20 01 2 B21e-0°r 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-
35 10-49


1758 


DM00406 


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e- 
09 224-278


1745 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins . 


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e-
10 371-389 BL00942B
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312


1778 


BL00084 


Copper type II, 
ascorbate-dependent


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 

o.lj^e-lo 10-58 
BL00084C 27.71 8.412e- 
11 107-158 


1779 


BL01013


Oxysterol-binding
protein family proteins. 


DUU1U1JU CQiSl ,5. / DoC — 

18 611-655 BL01013A 
25.14 2.891e-15 344- 
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e-
13 492-515


1784 


BL00741 


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 8.138e- 
13 492-515 



*Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
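
The field order given in the footnote above can be illustrated with a short parsing sketch. This is only an illustration under stated assumptions: the names `SignatureHit` and `parse_results` are hypothetical, line wraps within a RESULTS cell are assumed to have been joined already, and the accession pattern (two letters, five digits, optional subtype letter) is inferred from the entries in this table rather than from any database specification.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    accession: str    # accession number with optional subtype letter, e.g. "BL01272B"
    raw_score: float
    p_value: float
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature


# Assumed shape of one hit: ACCESSION SCORE P-VALUE START-END, whitespace separated.
_HIT = re.compile(
    r"(?P<acc>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+\.\d+)\s+"
    r"(?P<p>\d+\.\d+e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Return every hit found in one RESULTS cell (wrapped lines already joined)."""
    return [
        SignatureHit(m["acc"], float(m["score"]), float(m["p"]),
                     int(m["start"]), int(m["end"]))
        for m in _HIT.finditer(cell)
    ]


# Two of the hits listed for SEQ ID NO: 1363, with the wrapped lines joined.
hits = parse_results(
    "BL01272B 19.61 6.870e-30 113-148 BL01272A 6.49 1.231e-18 76-94"
)
for hit in hits:
    print(hit.accession, hit.raw_score, hit.p_value, hit.start, hit.end)
```

A regular expression is used here only because each hit has a fixed four-field shape; cells that are still garbled by text extraction will simply yield fewer hits rather than raise an error.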
TABLE 4



SEQ ID 

NO: 


PFAM NAME 


DESCRIPTION 


p-value


PFAM
SCORE


2


ig


Immunoglobulin domain


2.1e-32


i e 


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29


JL.Lv . / 


4 


zf-C2H2


Zinc finger, C2H2 type


1.6e-21 


84.9 


5 


fn3 


Fibronectin type III domain 


0 


ivj f . X 


6


fn3


Fibronectin type III domain 


0 


1035.0 


7 


fn3 


Fibronectin type III domain 


0


1090.4


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain 


4e-40


146.7


10 


p450


Cytochrome P450 


9.5e-17 


62.0 


12 


ank 


Ank repeat 


6e-20 


79.7


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15 


zf-MYND 


MYND finger


1.3e-06


35.4 




zf-MYND


MYND finger


1.3e-06


35.4 


17 


zf-C2H2


Zinc finger, C2H2 type


1.7e-99


343.9


18 


CAP_GLY


CAP-Gly domain 


1.2e-25 


98.7 


20 


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


1.6e-119


410.5 


21 


IMPDH_C


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102 


352.6 


22 




Eukaryotic protein kinase
domain


2.4e-79


277.0 


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74 


258.6


25


RNA_pol_A


RNA polymerase alpha subunit 


0 


1077.7 


26 


C1q


C1q domain


1.9e-10 


44.4


27


Ribosomal_L23


Ribosomal protein L23


7.8e-32 


111.2 


28 


Ribosomal_L23


Ribosomal protein L23


1e-29


104.2


30 


zf-A20 


A20-like zinc finger


1.5e-10


48.5


31 


zf-A20 


A20-like zinc finger


1.5e-10 


48.5 


32 


FMN_dh 


FMN-dependent dehydrogenase


5.4e-179


608.1


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID)


3.8e-59


209.9 


35 


ig 


Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig 


Immunoglobulin domain


1.4e-13


48.8 


40 


kinesin 




6.7e-76


265.6


44 


Ets 


Ets-domain


1.4e-56


182.1


45 


Ets


Ets-domain 


1.4e-56 


182.1 


46 


LRR 


Leucine Rich Repeat


1.7e-13


58.3


48 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-162


552.8


49 


ITAM


Immunoreceptor tyrosine-based
activation mot


1.4e-05


31.9 


50 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


8.5e-45


162.3


53


PRK


Phosphoribulokinase


2.1e-65 


230.7 


54 


myb_DNA-binding


Myb-like DNA-binding domain 


0.096


"ic 9 


55 


voltage_CLC 


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr


Sugar (and other) transporter


0.00015 


-64.3 


57


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25


96.3 


59 


ank 


Ank repeat 


5.9e-25 


96.3 


67 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain


7.9e-54


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 


ig 


Immunoglobulin domain 


8.2e-28 


94.7


73


pkinase


Eukaryotic protein kinase
domain


8e-69


242.1


74 


pkinase 


Eukaryotic protein kinase
domain


2.8e-38


140.6


76 

83
84
8*

88


zf-C4_Topoisom
fn3 

SH2 

ig 


Topoisomerase DNA binding C4 
zinc fing 

Prolyl oligopeptidase family 

Fibronectin type III domain

Src homology domain 2
Immunoglobulin domain 


5.4e-54 

4.3e-10
4.1e-51 
3.1e-22 


192.8 

36.8 

183.2 

67.7 


09 
92

93

1 QC 
1 73 


WD40 

laminin_G 

AMP-binding 

pkinase 


WD domain, G-beta repeat 
Laminin G domain 
AMP-binding enzyme
Eukaryotic protein kinase
domain 


0.0091
2.1e-21 
6" .le-2? 
2.4e-13 
1.4e-59 


14.0

84.6 

98.5 

-37.2 

211.4 


97
98


pkinase

adh_short
kinesin


Eukaryotic protein kinase 
domain 

short chain dehydrogenase 
Kinesin motor domain 


2.6e-51
2e-61


183.9 
217.5 


101
102 


IRS 
AAA 


PTB domain (IRS-1 type) 

ATPases associated with various 

cellular act 


2.2e-86 
3 . %e- 36 
6.8e-05


300.4 
133.0
-5.2


104 
Kofi 

1 1 n -7 — 


pkinase 
ras 


Eukaryotic protein kinase

domain 

Ras family 


2.7e-73
8.3e-24 


256.9 
92.5 


l 1U / 

108 
109


FYVE 

Cyt_reductase

zf-C2H2


FYVE zinc finger 

FAD/NAD-binding Cytochrome

reductase 

Zinc finger, C2H2 type 


5.4e-27 
7.7e-61


100.7 
215.5 


113 
116


pkinase 
PH 


Eukaryotic protein kinase 

domain 

PH domain 


2.3e-122 
4e-88 


420.0 
306.2


117 


lipocalin


Lipocalin / cytosolic fatty-
acid binding pr


3.1e-11
2.4e-14


45.2 
53.5


118 
120


pkinase 
WD40 


Eukaryotic protein kinase 
domain 

WD domain, G-beta repeat 


4.5e-20
2.4e-14 


76.3 
61.1 


121
i i.£ j 

124 
127


WD40 

IF5_eIF4_eIF
2

mito_carr


WD domain, G-beta repeat
eIF4-gamma/eIF5/eIF2-epsilon

Immunoglobulin domain 
Mitochondrial carrier proteins 


2.4e-14
1e-32

6.5e-08 


61.1 
122.2 

36.6


128
129

130
133


PP2C 

ATP1G1_PLM_MAT8

pfkB 

ACBP 


Protein phosphatase 2C
ATP1G1/PLM/MAT8 family 

pfkB family carbohydrate kinase 
Acyl CoA binding protein 


3e-16
2.2e-71
3.1e-20

4.5e-42
4.6e-22


58.6 

250.6 

60.6 

137.1 
86.7 


134
135
136 


rrm
IQ

ATP1G1_PLM_MAT8


RNA recognition motif.

IQ calmodulin-binding motif

ATP1G1/PLM/MAT8 family 


1.2e-31 
e. , oe - Uo 
9.3e-22 


118.5 
41.0
85.7 


139 
140


WH2 

zf-C2H2


Wiskott Aldrich syndrome
homology region 2 
Zinc finger, C2H2 type 


0.0067 


23.1


141 
143 

hue 1 


Peptidase_S26

arf

KRAB 


Signal peptidase I 

ADP-ribosylation factor family
KRAB box 


1.7e-82 
5.7e-10 

1.2e-39 


287.5 
35.7

145.2 


148 
149 

151


DUF6 
PDEase 

S4 


Integral membrane protein DUF6 
3'5'-cyclic nucleotide
phosphodiesterase 
S4 domain 


7.3e-30 

3.096 

3.8e-80 

1.1e-08


112.6 
8.0
231.1 

32.3 


153 
154 

155
157


tRNA-synt_1d
Cyt_reductase

ras
actin


tRNA synthetases class I (R) 
FAD/NAD-binding Cytochrome
reductase 

Ras family
Actin


3.8e-103 
7.8e-60

J.6e-28 3 
1.8e-26


356.1
212.2 

107.0 
17. 1 





158 


Jacalin 


Jacalin-like lectin domain 


0.09 


-24.9


160 


Zn_carbOpept 


Zinc carboxypeptidase 


5e-138 


471.9


165 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1 


167 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-07 


27.0 


168 


Ribosomal_S15


Ribosomal protein S15 


1.1e-06


29.0 


169


DEAD 


DEAD/DEAH box helicase 


1e-48


157.0 


171 


DUF59 


Domain of unknown function 
DUF59 


0.07 


-17.4 


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15 


58.6 


173 


globin 


Globin 


4.6e-18 


67.4 


174 


WW 


WW domain 


7.3e-06 


32.9


175 


ras 


Ras family 


1e-31


118.8 


178 


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family 


2.5e-17 


71.0 


179 


zf-C2H2


Zinc finger, C2H2 type 


1.5e-99


344.2


180 


C1q


C1q domain


8.8e-72 


251.9 


190 


Y_phosphatase


Protein-tyrosine phosphatase 


4.9e-287


967.0 


191 


efhand 


EF hand 


7.5e-16


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82


285.6 


194 


bromodomain 


Bromodomain 


5.8e-31 


111.4


195 


PALP 


Pyridoxal-phosphate dependent
enzyme 


2.5e-64


227.1 


19? 


DnaJ 


DnaJ domain 


1.6e-38 


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200 


acid_phosphat


Histidine acid phosphatase 


2.5e-10 


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204 


vATP-synt_AC39


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7


205 


vATP-synt_AC39


ATP synthase (C/AC39) subunit 


1.6e-139 


476.9 


20* 


ldl_recept_a 


Low-density lipoprotein
receptor domain 


2.4e-25 


97.6 


209 


ank 


Ank repeat 


1.4e-19 


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035 


1.2 


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con


Ubiquitin-conjugating enzyme


7.4e-74


258.8 


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43 


140.4 


216 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


4.5e-21 


83.4 


218


Glycos_transf_2


Glycosyl transferases


4e-21


83.6


219 


ig 


Immunoglobulin domain 


0.092


10.7 


222 


WD40


WD domain, G-beta repeat


7.4e-23 


89.4


224 


TPR 


TPR Domain 


1.2e-08 


42.1 


225 


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


226 


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats) 


1.5e-38


141.5




HSP70


Hsp70 protein 


2.4e-54


194.0


230 


GSHPx 


Glutathione peroxidases 


3.4e-47 


170.2 


231 


tsp_1


Thrombospondin type 1 domain


0.0075 


17.1 


233 


cyclin 


Cyclin 


4.6e-144 


492.0 


234


ras 


Ras family 


4.8e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29


109.4


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.7e-09 


45.0


244 


dCMP_cyt_deam


Cytidine and deoxycytidylate
deaminase


2.5e-05


31.1


245 




Immunoglobulin domain 


6.7e-08 


30.5 


248 


wnt


wnt family of developmental 
signaling protei 


9.1e-270 


742.6


250 


mito_carr 


Mitochondrial carrier proteins 


1.3e-5$ 


193.6


254 


adenylatekinase


Adenylate kinase 


1.8e-14 


55.7


255 


Cation_efflux


Cation efflux family 


2.8e-33 


124.0


256 


SH3


SH3 domain 


3.9e-14


60.4 


257 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.6e-52


187.2 


258 


adenylatekinase


Adenylate kinase 


2.1e-110 


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 


260 


Bacterial_PQQ


PQQ enzyme repeat 


1.6e-15 


65.0 


262 


proteasome 


Proteasome A-type and B-type


6.5e-64


225.7


267 


pkinase 


Eukaryotic protein kinase 
domain 


6.3e-27


101.0


270 


filament 


Intermediate filament proteins 


3.2e-150


512.5


271


Choline_kinase


Choline/ethanolamine kinase


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3.3e-77


269.9


280 


WD40


WD domain, G-beta repeat


7.8e-73


255.4


281


WD40 


WD domain, G-beta repeat 


7.8e-73


255.4 


284 


zf-DHHC 


DHHC zinc finger domain 


4.6e-24


93.4


287 


Exonuclease 


Exonuclease 


1.4e-67


238.0


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034


11.2


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034 


11.2


294 


zf-C2H2


Zinc finger, C2H2 type 


1.4e-29 


111.7 


295


zf-C2H2


Zinc finger, C2H2 type


2.2e-125


430.0 




mito_carr


Mitochondrial carrier proteins 


4.1e-59


205.5 


297


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


Glycos_transf_4


Glycosyl transferase


5e-87


302.5


304 


tRNA-synt_2 


tRNA synthetases class II (D, K 
and N) 


1.1e-84


294.8 


305 


KRAB 


KRAB box 


2e-44 


161.0


306 


rrm 


RNA recognition motif. 


2.7e-44


160.6


308 


7tm_1


7 transmembrane receptor
(rhodopsin family)


5.2e-39


I26\i "■ 


309 


DNA_polymeraseX


DNA polymerase X family


2.4e-64 


227.2 


311 


F-box 


F-box domain.


9.5e-08 


39.2 


312 


ig 


Immunoglobulin domain 


6.8e-19 


65.9


313 


Ets 


Ets-domain 


8.1e-60 


192.3 


315 


Kelch 


Kelch motif 


1.3e-106


367.6


317 


arf


ADP-ribosylation factor family


3.2e-35


130.4


318 


sugar_tr 


Sugar (and other) transporter


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6


322


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81 


282.6 


324 


Xlink 


Extracellular link domain 


4.5e-143


331. S 


326 


ARID 


ARID DNA binding domain 


5.1e-37


136.4


327 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


328 


cadherin 


Cadherin domain 


8.1e-81 


281.9


331 


chromo 


'chromo' (CHRromatin
Organization Modifier) 


4e-18 


66.7 


333 


Peptidase_M22


Glycoprotease family 


1.2e-136 


467.4


335 


vwa


von Willebrand factor type A 
domain 


2.3e-07




339 


ras 


Ras family 


7.8e-07


-59.1


340 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-64


225.4


342 


zf-C2H2


Zinc finger, C2H2 type


2.4e-85


297.0


343 


ig


Immunoglobulin domain


0.0005


18.0


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1 


351


EGF 


EGF-like domain


8.5e-20 


79.2 


352 


ank 


Ank repeat 


2.5e-101


350.0


354 


TBC 


TBC domain 


5.1e-15


63.3


355 


PHD 


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033


15.8


359 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-20


79.4


361 


ank 


Ank repeat 


6.6e-34


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53


189.7 


363 


efhand 


EF hand 


5.4e-10


46.6


367 


LRR


Leucine Rich Repeat


8.8e-44


158.9


368 


laminin_G


Laminin G domain


1.5e-33


121.7 


369 


PP2C 


Protein phosphatase 2C 


5.3e-20


73.9


372 


LIM 


LIM domain containing proteins 


9.9e-15


57.1


373


KRAB


KRAB box


4.8e-23


90.0


376


ion_trans


Ion transport protein


2.9e-09


-4.2 


377 


Beach 


Beige/BEACH domain




704.5


380 


pkinase 


Eukaryotic protein kinase
domain


1.6e-94


327.5


381


AMP-binding


AMP-binding enzyme


1.4e-07


-140.3


382 


HECT 


HECT-domain (ubiquitin- 
transferase).


1.3e-07 


-13.5


384 


ank 


Ank repeat 


2.5e-101


350.0


386 




Immunoglobulin domain 


9.5e-06


23.6


388


zf-C2H2


Zinc finger, C2H2 type




154.6 


389 


ig 


Immunoglobulin domain 


2.8e-15




390 


mito_carr 


Mitochondrial carrier proteins 


3.5e-67




392 


TPR 


TPR Domain


6.1e-17


69.7


393 


SH3 


SH3 domain 


3.5e-09


43.9 


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21


83.6


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066


23.1


399 


fn3 


Fibronectin type III domain 


4.1e-102 


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049


26.8


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6


404 


LRR 


Leucine Rich Repeat 


2.1e-10


48.0


405 


cadherin 


Cadherin domain 


8.1e-81


281.9


406 


zf-CXXC


CXXC zinc finger 


5e-15 


63.4


410 


RhoGEF 


RhoGEF domain 


1.1e-23


92.1


411 


F-box 


F-box domain. 


4.2e-06 


33.7


412 


SNF2_N 


SNF2 and others N-terminal
domain 


5.8e-16 


61.6 


415 


CPSase_L_chain


Carbamoyl-phosphate synthase 
(CPSase) 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24 


93.6 


419 


DENN


DENN (AEX-3) domain 


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8.1e-43


155.7


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch


G-patch domain


1e-19


78.9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repeat


Plexin repeat 


0.0023 


24.6 


427 


Plexin_repeat


Plexin repeat


0.0023


24.6








429 
431 


zf-C3HC4 
DEAD 


Zinc finger, C3HC4 type (RING
finger)

DEAD/DEAH box helicase


8.6e-11


39.2 


432 
433 


SH3 

GTP_CDC


SH3 domain 

Cell division protein 


le-66 
3.4e-16 


214.6 
67.2 


436 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.1e-114 
4 . 5e-194 


393.5 
658 .1 


438 


Ricin B lect 
in 


Similarity to lectin domain of 


0.0085 


10.5 


441


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal


1.2e-256


866.0 


442 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai


1.8e-235 


795.7 


443 


PDZ


PDZ domain (Also known as DHR
or GLGF).


1.9e-65 


230.9 


445 
446 


LON 

ig 


ATP-dependent protease La (LON)
domain

Immunoglobulin domain


0.00012


-17.1


451
452 


sushi
fn3


Immunoglobulin domain
Sushi domain (SCR repeat)
Fibronectin type III domain


0.00011
1.4e-18


20.1 
75.2 


454 
456 


pyridoxal_deC

kinesin


Pyridoxal-dependent
decarboxylase conse 
Kinesin motor domain 


1.5e-06
8.3e-14


35.2 
50.3


457 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.9e-217
1e-175


734.4 
597.1 


458 
468 


Josephin 
bZIP 


Josephin 

bZIP transcription factor 


0.0002


18.7


470 


NTP_transferase


Nucleotidyl transferase 


1.7e-07 
6.3e-06


31.8 
26.3


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9


473 


LIM 


LIM domain containing proteins 


0.00021 


20.7 


477 
478


zf-RanBP 
WD40 


Zn-finger in Ran binding

protein and others. 

WD domain, G-beta repeat 


0.028


21.0 


480 


KRAB 


KRAB box 


6.5e-18
1e-31


73.0 
118.8 


481 


ArfGap 


Putative GTP-ase activating
protein for Arf 


8.4e-66


232.0


485


SH2 


Src homology domain 2 


0.011 


11.4


486


C1q


C1q domain


4.3e-74


259.6


487


dsrm


Double-stranded RNA binding
motif


1.1e-47


171.9


489 


zf-C2H2


Zinc finger, C2H2 type


4.8e-153


521.9


490 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai


3.4e-222 


751. £ 


492 


SKI 


Shikimate kinase 


1.2e-10 


48.6 


497 


ENV_polyprotein


ENV polyprotein (coat
polyprotein)


2.6e-22


77.6


498 


abhydrolase_2


Phospholipase/Carboxylesterase


0.041 


-48.1 


500 


rrm 


RNA recognition motif. 


5.4e-34 


126.4 


501 


WW 


WW domain 


4.6e-18 


73.4 


502 


ig 


Immunoglobulin domain 


1.1e-10


39.5 


504 


abhydrolase


alpha/beta hydrolase fold


0.045


-3.6


505 


vwa 


von Willebrand factor type A 
domain 


7.1e-62 


219.0 


508 


Na K ATPase 
C 


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 




Exonuclease 


1.3e-56 


201.5 


510 


Glycos_transf_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


511 


Glycos_transf_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_transf_1


Glycosyl transferases group 1


1.9e-09


38.5


514


pro_isomerase


Cyclophilin type peptidyl-
prolyl cis-tr


L.8e-63 


221.4


515 


EGF 


EGF-like domain 


1.9e-18


74.7


516 


Surp 


Surp module 


4.3e-38


140.0


523 


ig


Immunoglobulin domain 


3.3e-06 


25.0


526 


UBX 


UBX domain 


1.1e-34


128.6


528 


adh_zinc 


Zinc-binding dehydrogenases 


2.7e-34 


127.4


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046


10.0 


531 


adh_short


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins


2.5e-81


281.7


533 


mito_carr


Mitochondrial carrier proteins


2e-61


213.5 


534


thiolase 


Thiolase 


3.5e-183


622.0


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7 


536


SCAN


SCAN domain 


4e-55 


JL70 . O 


537 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136




538


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 






539 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


1.9e-117




540 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136


466.0


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85


295.7


543 


zf-C2H2


Zinc finger, C2H2 type 


55^g 


242.6


544


DUF101


Protein of unknown function 
DUF101 


8.5e-38 


139.0 


545 


TGFb_propeptide


TGF-beta propeptide


1.1e-67


238.2 


547 


WD40 


WD domain, G-beta repeat 


2.6e-32


120.8


548 


RHD 


Rel homology domain (RHD) 


1.6e-238


686.2


549 


MMR_HSR1


GTPase of unknown function 


5.4e-67 


236.0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4.3e-127


435.6


554


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


J . JC IH 


259.8


555 


zf-UBR1


Putative zinc finger in N- 
recognin 


3.3e-16


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 


AMP-binding 


AMP-binding enzyme 


2.8e-06


■ibj , / 


562 


PABP 


Poly-adenylate binding protein,
unique domai


4.9e-36


139.8


564


Gag_p30


Gag P30 core shell protein


1.2e-67


238.2 


566 


PWWP 


PWWP domain 


8.1e-16


66.0


567 


SCAN 


SCAN domain 


7.3e-68


238.9


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84


294.3


570 


pkinase 


Eukaryotic protein kinase
domain


1.5e-84


294.3


571 


CN_hydrolase 


Carbon-nitrogen hydrolase


0.00081


-79.7 


572 


myosin_head


Myosin head (motor domain) 


0 


1495.2


573 


myosin_head 


Myosin head (motor domain) 


0 


1490.4


575 


Surp 


Surp module 


1.7e-23


91.5


576 


Surp 


Surp module 


1.7e-23 


91.5


577 


DNA_pol_B


DNA polymerase family B 


0 


1138.6


578


PDZ


PDZ domain (Also known as DHR
or GLGF).


8.3e-09


42.7 


579 


LRR 


Leucine Rich Repeat 


4.9e-21


83.3


580 


neur_chan 


Neurotransmitter-gated ion-
channel 


5.9e-177 


601.3


583


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase


7.3e-36 


116.3 


586 


KH-domain


KH domain 


2.9e-13 


57.5 


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins


2.3e-36 


133.4


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7


592 


hormone_rec 


Ligand-binding domain of
nuclear hormone 


3.5e-22 


87.1 


593 


PHD 


PHD-finger


3.8e-12 


53.8 


594


cadherin


Cadherin domain


4.2e-99


342.7


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2


597 


WD40 


WD domain, G-beta repeat 


0.00054 


26.7


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9


602 


G_Adapt_CT


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86


300.4


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4


606 


mito_carr


Mitochondrial carrier proteins 


g m 3e-67 


232.3 


608 


PWWP 


PWWP domain 


2.6e-28


107.5


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY


CAP-Gly domain


0.0046




615 


RFX_DNA_binding


RFX DNA-binding domain 


5.2e-54 


192.9 


616 


kinesin 


Kinesin motor domain 


i io.ni 


284.8


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278.5


618 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0.0098


13.1


620 


MATH 


MATH domain 


7.8e-05


22.2


621 


Y_phosphatase


Protein-tyrosine phosphatase


1.4e-32


121.6


622 




Eukaryotic protein kinase


4.4e-40


146.6


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopterin


Prokaryotic molybdopterin
oxidoreductas


1.4e-12


42.2 


625 


TPR 


TPR Domain 


1.1e-17


72.2


627


cNMP_binding


domain 


3.7e-58


206.6 


630 


adh_short


short chain dehydrogenase


5e-17


70.0


631


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-88 


307.1


632 


rrm 






30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain


1.6e-104 


360.7 


636 


Fork_head


Fork head domain


p . ye** * / 




637 


pkinase 


Eukaryotic protein kinase

domain 


3 . He- / \J 


•*} a c a 


642 


TPR 


TPR Domain 


4.8e-08


40.1 


643 


efhand


EF hand 


1 . 3C"£ / 


104.6


647 


SNF2_N 


SNF2 and others N-terminal
domain


1.2e-101




648


PseudoU_synth_2


RNA pseudouridylate synthase 


1.9e-55 


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0087


22.7


651 


ank 


Ank repeat 


1.3e-17


71.9


652 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171


581.8


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9


659 


FH2 


Formin Homology 2 Domain 


1e-107


371.2 


661 


pou 


Pou domain - N-terminal to
homeobox domain 


5.3e-45


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2 


663 


C2 


C2 domain 


6.7e-19 


76.2


664 


C2 


C2 domain 


6.7e-19 


76.2


667


GST


Glutathione S-transferases.


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-31


115.6 


670 


spectrin 


Spectrin repeat


4e-57


203.2


671 


I_LWEQ 


I/LWEQ domain


9.5e-101 


341.0 


672 


ABC_tran


ABC transporter 


5.3e-60 


212.8 


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3


675 


WD40 


WD domain, G-beta repeat 


4.8e-24


93.3


676


LRR 


Leucine Rich Repeat 


0.0015 


25.2


679 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-29 


107.7 


680 


zf-C2H2


Zinc finger, C2H2 type 


5.2e-05 


30.1


681 


CH 


Calponin homology (CH) domain 


2.4e-17


71.1


682


DSPc


Dual specificity phosphatase,
catalytic doma


4.3e-43


156.6


683 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.051 


10.8


687 


Synapsin 


Synapsin


0


1890.8 


689 


PR55


Protein phosphatase 2A 
regulatory subunit PR 


0 


1038.8 


691 


homeobox 


Homeobox domain 


8.5e-30 


112.4


696 


Peptidase_M24


metallopeptidase family M24


2.6e-59


210.5 


697 


RhoGEF 


RhoGEF domain 


9.5e-35 


128.9 


698 


PHD 


PHD-finger


0.008 


9.3 


701 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-123 


422.0 


702 


sulfatase


Sulfatase 


3e-231 


781.6 


703 


zf-C2H2


Zinc finger, C2H2 type 


5.7e-20 


79.8 


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8


708 


WD40


WD domain, G-beta repeat


4.8e-19


76.7 


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


713 


DEAD 


DEAD/DEAH box helicase 


9.9e-42 


134.9 


714


PH


PH domain 


J. . DC U3 


J? . u 


715 


DSPc


Dual specificity phosphatase, 
catalytic doma 


l ^p-i 1 ; 

A . 3C J / 


138.2


717


Sialyltransf


Sialyltransferase family


7.5e-31


115.9


718 




Immunoglobulin domain 


le-29 


100.8


719 


integrin_B 


Integrins, beta chain 


0




720 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-08


32.4 


722 


Peptidase_C2 


Calpain family cysteine 
protease 


3e-145 


495.9


723 


ig


Immunoglobulin domain


2.2e-05


22.4 


724 


F-box 


F-box domain.


0.007


23.0


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5


726 


Nop 


Putative snoRNA binding domain


8.1e-58


205.5


727 


WD40


WD domain, G-beta repeat


7.5e-26


99.3 


730 


dsrm


Double-stranded RNA binding
motif


0.027


12.1 


731 


dynamin 


Dynamin family 


4.2e-16


66.9


733 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.8e-10


41.7


735


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26


100.1


738 


DEAD 


DEAD/DEAH box helicase


8.6e-57


182.5 


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32


119.5 


742 


ras 


Ras family 


2.2e-100


346.9


743 


PMI_typeI 


Phosphomannose isomerase type I


1.2e-243 


822.9 


747


trypsin 


Trypsin 


6.4e-88


279.4


748 


kazal 


Kazal-type serine protease 
inhibitor domain 


2.2e-52 


187.4 


749 


efhand


EF hand 


6.3e-06 


33.1


751 
752 


PHD
zf-C2H2


PHD-finger

Zinc finger, C2H2 type


4.9e-16


66.7 


753 




haloacid dehalogenase-like
hydrolase


3.2e-21
6.1e-11


83.9
49.8 


754 


Ribosomal_L39


Ribosomal L39 protein 


0.00018 


26.7 


755 


PH


PH domain 


3.6e-14 


55.7 


758 


SCAN 


SCAN domain


1.4e-53 


191.5 


759 
760


PA 


PA domain


0.0065 


23.1 




arf


ADP-ribosylation factor family 


2.2e-19


77.8 


761 


CIDE-N 


CIDE-N domain


2.2e-40 


147.6


762 


histone 


Core histone H2A/H2B/H3/H4


9.9e-53


1 QQ C 

lot) . b 


763 


zf-MYND


MYND finger


4.1e-14


DU. J 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52




767 


vwc 


von Willebrand factor type C 
domain 


2.9e-34


127.3


769 


efhand 


EF hand 


4.8e-11


50.1 


770 


zf-C4


Zinc finger, C4 type (two 
domains) 


2.4e-53 


181.6


772 


ras 


Ras family


7e-90


312.0


773 


sulfatase


Sulfatase


1e-142


487.5


775 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


776 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5


777 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32


121.1


779 


G6PD 


Glucose-6-phosphate
dehydrogenase


1.5e-76


-lie tz 


780


spectrin


Spectrin repeat


3.7e-29


iiU .J 


781 


mito_carr


Mitochondrial carrier proteins 




198 . S 


782 


SCAN 


SCAN domain 


1.3e-24


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


a io.n7 
** . ic U / 


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7 


786 


ras 


Ras family 


5.3e-39


143.0


787 


RNase_HII


Ribonuclease HII


"5 ezn 
& . se-o ( 


237.1 


790 


PI3_PI4_kina 
se 


Phosphatidylinositol 3- and 4-
kinases


5.4e-108


372.2


795 


cadherin 


Cadherin domain 


2.5e-40


147.4


796 


ARID 


ARID DNA binding domain 


l.£e,-20 


81.6


797 


trypsin 


Trypsin 


9.9e-20


64.8


799 


CH 


Calponin homology (CH) domain


3.7e-15


63.8


801 


Gal-bind_lectin


Vertebrate galactoside-binding
lectin


4.1e-25


88.7 


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1


806 


TBC 


TBC domain 


1.8e-26


101.4


807 


TBC 


TBC domain 




101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase


8.8e-80


278.5


811 


CBFD_NFYB_HMF


Histone-like transcription
factor 


fie - 14 


59.8


812 


adh_short


short chain dehydrogenase 




79.3


814 


IMP4 


Domain of unknown function




250.0


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66


232.1


816 


Pept_tRNA_hydro


Peptidyl-tRNA hydrolase 




1 1 Q Ct 

1 Jfl . 0 


817 


ARID 


ARID DNA binding domain 


2.5e-18


74.3


826 


IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


121.5


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 




131.2


831 


LRR 


Leucine Rich Repeat 


2.1e-26


AUX . X 


832 


laminin_EGF


Laminin EGF-like (Domains III 
and V) 


2e-57 


70A O 


839 


rrm 


RNA recognition motif. 


1.3e-22


aS — P 

OO . 3 


840 


Y_phosphatase


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase 


Eukaryotic protein kinase 
domain 






346.3 


844 


Ribosomal_L22e


Ribosomal L22e protein family


i'e-64 


228.4 


846 


IBR 


IBR domain 


9e-15 


62.5 


849 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9


851 


SET 


SET domain 


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


853


SRCR


Scavenger receptor cysteine-
rich domain


0 


1025.4 


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012


-6.0 


858


COX6A


Cytochrome c oxidase subunit
VIa


3.4e-58


204.7


859


rrm


RNA recognition motif.


5.4e-45


162.9


861


PRK


Phosphoribulokinase


5.1e-62


219.4


863


mito_carr


Mitochondrial carrier proteins


2.9e-53


185.5


864


HSP90


Hsp90 protein


4.7e-158


538.5


866


ig


Immunoglobulin domain


4e-12


44.1


867


zf-C2H2


Zinc finger, C2H2 type


7e-135


4^1.5


872


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149.8


874


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218


739.0


879 


Ribosomal_S12e


Ribosomal protein S12e 


2.1e-98 


340.3 


882
883


serpin 
Patatin 


Serpins (serine protease 

inhibitors) 

Patatin 


2.5e-42 


145.7 


884 
887 


RA 

DUF92


Ras association (RalGDS/AF-6) 
domain 

integral membrane protein DUF92 


1.2e-51
0.044 


182.0 
8.0 


889 


sugar_tr 


Sugar (and other) transporter 


2.7e-12 
8.2e-63 


54.3 
222.1 


893 


DUF28 


Domain of unknown function 
DUF28


1.3e-43 


158.3 


896 
898 


IP_trans
DEAD 


Phosphatidylinositol transfer 
protein 

DEAD/DEAH box helicase 


6.5e-98 


338.7 


899 
900 
901 
902 
904 


KE2
KE2
zf-C2H2 
ras 

TPR


KE2 family protein 
KE2 family protein 
Zinc finger, C2H2 type 
Ras family 
TPR Domain 


1.5e-48

7e-61

4.3e-51 

2.7e-57 

2.3e-75 


15(> .5 
215.7
183.2 
203.8 
263.8


906 
907 
908 
909 


GBP 
GBP 
WD40 
PH


Guanylate-binding protein 
Guanylate-binding protein 
WD domain, G-beta repeat
PH domain 


3.2e-22
8.9e-253
1.1e-239
2.6e-26
1.3e-09


87.2 

853.1

809.6 

100.8 

39.4 


910 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-39 


144.1


913 

921 
922 


Epimerase

TBC 
WD40


NAD dependent 

epimerase/dehydratase family 
TBC domain 

WD domain, G-beta repeat 


5e-07 
1.5e-09


-88.5 
30.7 


923 
924 

925 


WD40 

Hydrolase 
UQ_con


WD domain, G-beta repeat 
haloacid dehalogenase-like 
hydrolase 

Ubiquitin-conjugating enzyme


1.6e-25 
8.2e-07
2.9e-05 


98.2 
36.1 
29.1 


926 
928 


CH 
WD40 


Calponin homology (CH) domain
WD domain, G-beta repeat 


0.00033 
3.3e-53


-27.6 
190.2 


929 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.9e-48
3.1e-10


172.7
37.4


930 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family


7.2e-105


361.8


931 
936 


Ribul_P_3_epim

C2


Ribulose-phosphate 3 epimerase
family

C2 domain


1.2e-96


334.4


937


NAP_family


Nucleosome assembly protein
(NAP)


2.2e-62
1.1e-22


220.7 
84.6 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1 


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75 


263.2 


949 


WD40 


WD domain, G-beta repeat 


1.8e-27 


104.7 


950 


Acyltransferase


Acyltransferase


1.6e-07 


38.4


951 


SAM


SAM domain (Sterile alpha
motif)


0.014 


14.5 


954 


GFO_IDH_MocA


Oxidoreductase family 


1.3e-11


52.0 


955 


BTB


BTB/POZ domain


7e-22 


86.1 


956 


BTB 


BTB/POZ domain 


7e-22 


86.1 


957 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


0.053 


-22.2 


959 


ras 


Ras family


2.4e-97 


336.8 


960 


ras 


Ras family 


8.4e-43 


155.6 


961 


Acetyltransf


Acetyltransferase (GNAT) family 


1.2e-08


42.2 


962 


adh_short


short chain dehydrogenase 


2.4e-31 


117.6 


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.2 


969


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193 


653.9 


970 


RNase_PH


3' exoribonuclease family


9e-24 


92.4 


975 


WW 


WW domain 


5.7e-25 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


3.6e-21 


83.7 


978 


Ribosomal_L17


Ribosomal protein L17 


2.4e-20 


81.0 


979 


LIM


LIM domain containing proteins 


5.8e-42


152.8


980


Calsequestrin


Calsequestrin 


1.7e-297 


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family 


1.2e-10 


43.2 


983 


oxidored_q6 


NADH ubiquinone oxidoreductase, 
20 Kd sub 


4.8e-63 


222.9


988 


TBC 


TBC domain 


2.2e-50 


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8 


993 


tRNA_int_endo


tRNA intron endonuclease 


0.0017 


-34.2


994 


homeobox 


Homeobox domain 


4e-18 


73.6 


997 


pyr_redox 


Pyridine nucleotide-disulphide 
oxidoreducta 


0.012 


11.6 


1000 


mito_carr


Mitochondrial carrier proteins 


9.7e-123


421.2


1001 


RA 


Ras association (RalGDS/AF-6) 
domain 


1.2e-15 


65.4 


1004


DUF81


Domain of unknown function 
DUF81 


0.099 


10.2 


1005 


actin 


Actin 


1.3e-174


574.3


1006 


actin 


Actin 


3.1e-130


428.6 


1007 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8 


1008 


TPR 


TPR Domain 


6.1e-44


159.0 


1009 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216.6 


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216.6


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


4.7e-15


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018


RhoGAP 


RhoGAP domain 


1.6e-78 


274.3 


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18 


69.7 


1026 


HMG_box 


HMG (high mobility group) box 


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5


1028 


UQ_con 


Ubiquitin-conjugating enzyme


1.4e-49 


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


0.028 


TT.3 


1034 


Hydrolase 


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6


1037 


KRAB 


KRAB box 


4.8e-06 


32.4


1038


Cation_efflux


Cation efflux family 


7.1e-45 


152.5 


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18


74.7 


1043 


zf-C2H2


Zinc finger, C2H2 type 


3.7e-24


93.7


1045 


lectin_c 


Lectin C-type domain


1.9e-28


108.0


1046 


Glucosamine_iso


Glucosamine-6-phosphate
isomerase


0.00013 


-25.1


1047

1049


ligase-CoA 


CoA-ligases


4.5e-80 


279.4 






Immunoglobulin domain 


1.7e-09 


35.6 


1050 
1054


Ribosomal_L24e


Ribosomal protein L24e


2e-33 


124.5 


1055




Amidase 


4.3e-152 


518.7 






RNA recognition motif. 


""^.Be-ie 


100.3 


1058
1059


annexin 


Annexin 


6.9e-44 


159.2 


1060


PMP22_Claudin




PMP-22/EMP/MP20/Claudin family 


0.023 


-23.6 


1062




Homeobox domain 


3.2e-31


117.2 


1064


Acyltransferase




o.oooe's" 


10.5 


1065


AMP-binding


AMP-binding enzyme 


6.6e-100 


345.3 




LRR


Leucine Rich Repeat 


3.3e-14 


60.6 


1066
1071


GTP1_OBG


GTPl/OBG family 


4.8e-41 


141.8 


1072


ig


Immunoglobulin domain


8.4e-48


159.1 


1074


PHD 


PHD-finger


6.8e-07 


36.3 




DENN 


DENN (AEX-3) domain 


6.3e-33 


121.5 


1075 


SCP 


SCP-like extracellular protein 


4.7e-41


149.8 


1077
1078


OLF 


Olfactomedin-like domain 


2.2e-66


234.0




mito_carr


Mitochondrial carrier proteins 


1e-42


149.3 


1079


WD40


WD domain, G-beta repeat


6.2e-45


162.7 


1087


START 


START domain 


1.5e-48


174.7


1093
1094


DSPc 


Dual specificity phosphatase,
catalytic doma 


"3.36-6"} 


223.4 




GSHPx 


Glutathione peroxidases 


9.6e-41 


148.8 


1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


264.0


1096 


DUF25


Domain of unknown function 
DUF25 


6e-75 


262.4


1105


Nitroreductase


Nitroreductase family 


1.3e-13 


58.6 


1106 
1107


PTE 


Phosphodiesterase family 


1.3e-179


dift.i 




DAGKc 


Diacylglycerol kinase catalytic 
domain 


0.00049 


19.6 


1109


ras 


Ras family 


1.3e-15 


40.7


1115


ArfGap


Putative GTP-ase activating
protein for Arf


9.7e-47


168.7 


1116


HMG14_17


HMG14 and HMG17


4.4e-21


83.5


1117


HMG14_17


HMG14 and HMG17 


9.9e-12 


$2.4 


1119


FAA_hydrolase


Fumarylacetoacetate (FAA) 
hydrolase fam 


2e-83 


290.6 


1120


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6 


1123


abhydrolase


alpha/beta hydrolase fold 


9.2e-23 


89.0 


1129 


pro_isomerase


Cyclophilin type peptidyl-
prolyl cis-tr


2.2e-56 


197.1 


1131


DnaJ 


DnaJ domain 


1.6e-30 


114.9 


1132


WD40 


WD domain, G-beta repeat 


1.3e-19


78.6


1133
1134


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9




PH 


PH domain 


0.0015


17.8 


j 11 Jb 

("i 117 '"" " 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1 XXJ 1 

1 111 a" '" 1 "~ " 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8 


Ti*i 


ras 


Ras family 


1.5e-86 


301.0 




pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 
1153


Acyltransferase


Acyltransferase


1.2e-05 


29.9 




IRS 


PTB domain (IRS-1 type)


5.4e-55


196.1


1155


ig


Immunoglobulin domain 


1.3e-31 


106.9 


1157 
1159


Asparaginase_2


Asparaginase 


6.4e-72 


252.3




GMC_oxred


GMC oxidoreductases


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021


27.9


1163 


linker_histone


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 


Death effector domain


3.9e-05


30.5




IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1168 


SAM 


SAM domain (Sterile alpha 
motif) 


0.04 


10.5


1170 


abhydrolase


alpha/beta hydrolase fold


0.098


-7.5


1174 


SAP 


SAP domain 


3.9e-10 


47.1 


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5 


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35


129.9


1180 


Ets 


Ets-domain 


1.8e-09 


33.3 


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016 


24.7 


1182 


TCL1_MTCP1 


TCL1/MTCP1 family 


9.5e-56 


198.6


1184 


RasGEF 


RasGEF domain 


1.7e-88


307.4


1185


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6


u-PAR/Ly-6 domain


0.0042


15.6 


1188 


Orn_DAP_Arg_dec


Pyridoxal-dependent
decarboxylase


6.2e-128 


430.6 


1193 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1195


Sec1


Sec1 family


3.2e-183


622.1


1196 


pyr_redox 


Pyridine nucleotide-disulphide 
oxidoreducta 


3.1e-32 


lli.B " " 


1197 


Glyco_transf_8


Glycosyl transferase family 8


1.2e-09


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8


1203 


adh_short 


short chain dehydrogenase 


8.3e-45 


162.3


1206


Ubie_methyltran


ubiE/COQ5 methyltransferase
family


1.3e-121 


417.4 


1208 


7tm_3 


7 transmembrane receptor 


7.2e-09 


29.0


1209 


ank 


Ank repeat 


3.9e-15


63.7 


1210 


vATP-synt_AC39


ATP synthase (C/AC39) subunit 


2.5e-128 


439.7 


1212 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand


EF hand 


3.2e-07 


37.4 


1219 


rrm 


RNA recognition motif. 


2.1e-40


147.7


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71


251.1 


1223 


G-gamma


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX 


PX domain 


2.2e-15


64.5


1233 


PX 


PX domain 


2.2e-15 


64.5 


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-09 


44.0


1241


Peptidase_M20


Peptidase family M20/M25/M40


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9


1247


UPF0006


Metalloenzyme of unknown 
function UPF0006 


6.3e-61 


215.8 


1248 


Glycos_transf_2


Glycosyl transferases


4.5e-10 


46.9 


1249 


efhand 


EF hand 


4e-11


50.4


1254 


UQ_con 


Ubiquitin-conjugating enzyme 


2.1e-73 


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_transf


Formyl transferase 


4.9e-30 


108.3 


1259 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4


1261


DiHfolate_red


Dihydrofolate reductase


2.1e-69


241.7 


1262 


G_glu_transpept


Gamma-glutamyltranspeptidase


1.8e-110 


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22 


86.9


1266 


SCP 


SCP-like extracellular protein 


5e-29


108.0 


1267 


K_tetra 


K+ channel tetramerisation 
domain 


2.8e-27


104.0


1269 


ras 


Ras family 


1.3e-85 


297.9


1275


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.2e-10


37.0 


1276 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


69.8


1277 


abhydrolase 


alpha/beta hydrolase fold 


5.6e-21 


83.1 


1279 


trypsin 


Trypsin 


4.4e-41


132.0 


1286 


PBP 


Phosphatidylethanolamine-
binding protein


1.3e-13


58.7


1285 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14


49.6


1287 


ank 


Ank repeat 


1.7e-52 


187.8 


1294


fn3


Fibronectin type III domain


0.026


20.9


1295 


GBP 


Guanylate-binding protein


0.00026


-70.0


1296


PMP22_Claudin


PMP-22/EMP/MP20/claudin family 


6.9e-41 


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14


60.7


1298 


LIM 


LIM domain containing proteins 


5.8e-21 


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43


145.2


1307


mito_carr


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308 


WD40 


WD domain, G-beta repeat 


1.6e-17


71.6


1310


UPAR_LY6


u-PAR/Ly-6 domain


7.1e-20


75.5


1313 


thiored 


Thioredoxin 


3.6e-05




1314 


Aa_trans


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


1316 


trypsin 


Trypsin 


4.4e-41


132.0


1320


Ribosomal_L13


Ribosomal protein L13


3.9e-62


o 1 q a 


1327 


Armadillo_seg


Armadillo/beta-catenin-like
repeats




23.4


1328


KRAB


KRAB box


0.052


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1330 


Bcl-2 


Bcl-2 family 




■ — ; — 2 


1331 


PX 


PX domain 




48.0 


1333 


KRAB 


KRAB box 


1.8e-36


1J4 . b 


1334 


UPP_synthetase


Putative undecaprenyl
diphosphate synt


2.3e-89


Tin i 


1335 


UPP_synthetase


Putative undecaprenyl
diphosphate synt


1.8e-59


211.0


1336 


DSPc


Dual specificity phosphatase,
catalytic doma


1.2e-31


118.6 


1337 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.3e-12


54.5


1338 


TPR 


TPR Domain 


0.00021 


28.1 


1340 


metalthio


Metallothionein


0.013 


20.3 


1341 


mutT 


Bacterial mutT protein 


5.8e-09 


36.5 


1343 


Band_41


FERM domain (Band 4.1 family)


1.3e-38 


122.5 


1344 


Kelch 


Kelch motif 


1.4e-44


161.5 


1345 


Antifreeze 


Antifreeze protein 


1.2e-10


48.8


1347


3Beta_HSD


3-beta hydroxysteroid
dehydrogenase/isomera


0.086


-177.2 


1348 


BTB 


BTB/POZ domain


5.3e-28


106.5 


1349 


DUF6 


Integral membrane protein DUF6


0.033


15.8


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.1 


1352 


Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase


3.6e-65 


209.0 


1356 


C2 


C2 domain


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57 


203.1 


1360 


zf-C2H2


Zinc finger, C2H2 type


7.4e-141


481.4


1361 


HMG14_17


HMG14 and HMG17


7.9e-40 


145.7


1362 


SIS 


SIS domain 


3.8e-30


113.6


1363 


SIS 


SIS domain 


1.3e-28 


108.5


1364 


ig 


Immunoglobulin domain 


0.00026 


19.0 


1368 


K_tetra


K+ channel tetramerisation
domain


1.1e-16


68.9 


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113 


390.1 


1372 


DnaJ 


DnaJ domain 


6.6e-36 


132.7


1376 


KRAB 


KRAB box 


2.1e-38 


141.0 


1378 


ELM2 


ELM2 domain


2e-23


91.3


1380 


thiored 


Thioredoxin 


1.2e-23 


82.8


1381 


ank 


Ank repeat 


2.3e-83 


290.4 


1382 


BTB 


BTB/POZ domain 


3e-11


50.8


1383


WD40


WD domain, G-beta repeat 


1.6e-19 


78.3


1384 


WD40 


WD domain, G-beta repeat 


6.3e-24 


92.9 


1387 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-09


35.6 


1389 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-50 


179.5 


1390 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-85 


296.9 


1393 


kinesin 


Kinesin motor domain 


7.8e-188 


637.4


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22 


8 6". £ 


1402 


bZIP 


bZIP transcription factor 


0.035


13.1


1405 


sugar_tr 


Sugar (and other) transporter 


0.003 


-101.5


1406 


RhoGAP 


RhoGAP domain 


8.9e-47


168.8


1407 


rrm 


RNA recognition motif. 


1e-35


132.1


1408


LRR


Leucine Rich Repeat


2.1e-13


56.0 


1409 


Nebulin_repeat


Nebulin repeat


6e-54


192.6


1410 


ank 


Ank repeat 


1.6e-17


71.6


1412


Ribosomal_L5_C


ribosomal L5P family C-terminus


8.2e-58


205.5 


1415 


trypsin 


Trypsin 


4.7e-35


270.4 


1416 


aminotran_1


Aminotransferases class-I


4.4e-05


-91.2


1417


S1


S1 RNA binding domain


1.6e-07


33.1 


1419 


WD40


WD domain, G-beta repeat


2.2e-09


44.6


1422 


cadherin 


Cadherin domain


8.3e-42


152.3


1424 


SH3 


SH3 domain 


2.5e-80


280.3


1425 


PHD 


PHD-finger


3.2e-17


70.6


1426 


PHD 


PHD-finger


3.2e-17


70.6


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1e-37


138.8


1428 


helicase_C 


Helicases conserved C-terminal
domain


1e-26


102.2


1429 


WD40 


WD domain, G-beta repeat 


3.9e-07 


37.2 


1430 


inositol_P 


Inositol monophosphatase family 


2.5e-ld 


40.2 


1431 


mito_carr


Mitochondrial carrier proteins 


4.3e-83 


287.7 


1433 


C1q


C1q domain


2.9e-16


66.2


1434 


WD40 


WD domain, G-beta repeat 


1.6e-13 


58.3 


1435 


Inos-1-P_synth


Myo-inositol-1-phosphate
synthase


7e-228 


770.4 


1436 


rrm 


RNA recognition motif. 


1.4e-34 


128.3 


1438 


ig 


Immunoglobulin domain 


1.3e-12 


45.6 


1440 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441


G_Adapt_CT


Gamma-adaptin, C-terminus 


3.4e-67 


236.7 


1443 


Kelch 


Kelch motif 


0.00013 


28.7 


1446 


ARID 


ARID DNA binding domain 


1.8e-21


84.7 


1447 


zf-C2H2


Zinc finger, C2H2 type


9.4e-28 


105.6 


1448 


AMP-binding 


AMP-binding enzyme


2.6e-07


-145.1 


1451 


rrm


RNA recognition motif.


6.5e-21


82.9


1454 


ig 


Immunoglobulin domain 


5.6e-44


146.7 


1455 


Sialyltransf 


Sialyltransferase family 


5.4e-21 


83.2 


1460 


Aldose_epim 


Aldose 1-epimerase


1.9e-35 


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6


1470


TIG


IPT/TIG domain


3.1e-19


77.3 


1472 


PseudoU_synt 


RNA pseudouridylate synthase


4.3e-16 


56.9












1474 


DENN


DENN (AEX-3) domain


1.3e-44


161.6


1475


Cation_efflux


Cation efflux family


4.6e-49


176.4


1477 


TBC 


TBC domain 


8e-47 


169.0


1478


rrm


RNA recognition motif.


2e-21


84.6 


1480 


ig 


Immunoglobulin domain 


5.5e-06


24.3


1484


Telo_bind_alpha


Telomere-binding protein alpha
subuni


0.026 


-225.9 


1485 


zf-C2H2


Zinc finger, C2H2 type


1.8e-68


240.9 


1486 


pkinase 


Eukaryotic protein kinase
domain 


9.5e-13 




1488 


helicase_C 


Helicases conserved C- terminal 
domain 


1.4e-15 


65.2


1489 


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase
family 


5.2e-41 


1 A Q 1 
*l3 ■ / 


1491 


guanylate_cyc


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46 


166.1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19


77.2


1495


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


I 7 , Xe-10 


JO • J 


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.8


1500 


SH3 


SH3 domain 


9.3e-05 


27.2


1502


homeobox


Homeobox domain 


0.084


13.8


1503 


homeobox 


Homeobox domain 


0.084


13.8


1505 


EGF


EGF-like domain


2.7e-23


90.8


1506


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2


1508 


Peptidase_M20


Peptidase family M20/M25/M40


2.8e-28




1511 


PX 


PX domain 


1.9e-11


51.5


1512 


Sulfatase 


Sulfatase


2.8e-35


130.7 


1516 


Syntaxin 


Syntaxin 


0.011


-62.3


1518 


aminotran_3


Aminotransferases class-III
pyridoxal-pho 


9.7e-106 


305.6 


1520 


ig 


Immunoglobulin domain 


0.075 


11.0 


1521 


RA 


Ras association (RalGDS/AF-6)
domain


0.013 


13.3 


1523 


RhoGAP 


RhoGAP domain 


2.5e-05


10.7


1528 


WD40 


WD domain, G-beta repeat


5.4e-24


93.1


1535 


IMS 


impB/mucB/samB family


7.8e-95


328.5 


1538 


FYVE 


FYVE zinc finger


3.2e-27


101.5 


1539 


DAGKc 


Diacylglycerol kinase catalytic
domain


6e-07 


36.5 


1540 


Ocular_alb 


Ocular albinism type 1 protein


0


1184.7


1653 


SAP 


SAP domain


6e-06


33.2


1654 


Amino_oxidas 
e 


Flavin containing amine oxidase 


3.2e-43 


157.0


1655 


Amino_oxidas 
e 


Flavin containing amine oxidase 


3.2e-43 


157.0


1656 


RhoGEF 


RhoGEF domain


1.4e-24 


95.1 


1657 


MMR_HSR1


GTPase of unknown function


0.0011 


-45.5 


1659


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family


2.5e-11


51.1


1660 




Actin


6.6e-21


69.9


1661 


BAH 


BAH domain


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A
domain


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat


1.4e-67 


237.9 


1667 


zf-C2H2


Zinc finger, C2H2 type


1.3e-93


324.4


1669 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23 


84.3 


1671 


SH2 


Src homology domain 2


5.4e-15


46.9


1672 


chromo 


'chromo' (CHRromatin
Organization MOdifier)


2.1e-18


67.7


1674


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H 
type 


0.0025 


17.6 


1676


Glyco_hydro_47


Glycosyl hydrolase family 47


1.8e-187


636.2


1677


Glyco_hydro_47


Glycosyl hydrolase family 47 


4.5e-74 


259.5 


1680 


WD40


WD domain, G-beta repeat 


1.1e-27


105.5


1681


WD40


WD domain, G-beta repeat


1.1e-27


105.5 


-LOO J 


MMR_HSR1


GTPase of unknown function


1.8e-78


274.1


1691 


rrm 


RNA recognition motif. 


1.8e-37 


137.9


1692 


rrm 


RNA recognition motif. 


1.8e-37 


137.9


1693


AAA 


ATPases associated with various 
cellular act 


1.3e-81 


284.5 


1697 


Ferric_reduct


Ferric reductase like 
transmembrane com 


8.4e-82 


285.2 


1CQQ 

lost) 


Ferric_reduct


Ferric reductase like 
transmembrane com 


3.5e-53 


190.1




zf-C2H2


Zinc finger, C2H2 type 


4.4e-34 


126.6


1700


arf


ADP-ribosylation factor family


9e-19


75.8


1702 


GTP_EFTu


Elongation factor Tu family


0.014


11.4


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4


1707


pkinase


Eukaryotic protein kinase 
domain 


1.2e-88 


307.9


1709


WD40


WD domain, G-beta repeat 


0.0035 


24.0


1710


LRR


Leucine Rich Repeat


1.2e-30


115.3


1711 


WW 


WW domain 


7.6e-12 


52.8


1712


ank


Ank repeat


4.2e-34


126.7


1713 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-09 


38.3 


1714 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3


1715 


ras 


Ras family 


4.4e-41 


149.9 


1718 


HMG_box 


HMG (high mobility group) box 


8.3e-21 


82.6


1719


TBC


TBC domain


1.1e-45


165.2


1721


HLH


Helix-loop-helix DNA-binding 
domain 


9.2e-10 


45.9


1723


dsrm


Double-stranded RNA binding
motif 


2.9e-05 


30.9 


1724 

T 7 nr 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0 .045 


9.2


<i • 7 - 3c 


CIDE-N 


CIDE-N domain 


5.9e-40 


146.2


1/25 

V70Q 


HAT 


HAT (Half -A-TPR) repeats 


2.9e-44 


160.5


X f da 


efhand


EF hand


5.1e-20 


79.9 


1733 

17 ->cr 


Hist_deacetyl


Histone deacetylase family


1.7e-104


360.6




LRR


Leucine Rich Repeat


4.6e-34


126.6


1739


PI-PLC-X


Phosphatidylinositol-specific
phospholipase 


0.0023 


16.1 


1743 




Ras family 


3.7e-10 


-21.3


1744


ras


Ras family


3.7e-10


-21.3


1745




RasGEF domain 


3.2e-49 


176.9 


1746 




short chain dehydrogenase 


7.1e-08


34.6


1751




zinc finger, C2H2 type 


9e-39 


142.2


1754 




Fibronectin type III domain 


5.5e-101 


348.9


1756 


zf-C2H2 


Zinc finger, C2H2 type 


6.3e-93 


322.1


1758 


rrm 




0.017 


21.2


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1765 


MMR_HSR1


GTPase of unknown function


6.4e-41 


149.4 


1769 


CN_hydrolase 


Carbon-nitrogen hydrolase


3e-06 


-43.9 


1775 


ank 


Ank repeat


4.1e-07


37.1


1779


Oxysterol_BP


Oxysterol-binding protein


4.7e-56


199.6


1783
1784 


RhoGEF 
RhoGEF 


RhoGEF domain 
RhoGEF domain 


1.6e-23 
1.6e-23 


91.6 

91.6


1785 


rrm 


RNA recognition motif. 


6.4e-14 


59.7 



TABLE 5



SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN 
SCORE) 


1 


1-21 


0.991 


0.955 


2 


1-31 


0.995 


0.944 


3


1-33 


0.949 


0.736 


4 


1-19 


0.970 


0.951


5


1-26 


0.971 


0.863 


6 


1-26 


0.971 


0.863 


7 


1-26 


0.971 


0.863 


8 


1-26 


0.971 


0.863 


9 


1-46 


0.982 


0.901 


10 


1-21 


0.991


0.955 


11 


1-23 


0.989 


0.899 


12 


1-25 


0.955 


0.803 


13 


1-18 


0.932 


0.625 


14 


1-18 


0.938 


0.876 


15 


1-25 


0.941 


0.811 


16 


1-17 


0.972 


0.939 


17 


1-27 


0.964 


0.777 


18 


1-16 


0.914 


0.657 


19 


1-19 


0.953 


0.840 


20 


1-20 


0.935 


0.701 


21 


1-22 


0.974 


0.850 


22 


1-33 


0.961


0.895 


23 


1-19 


0.991 


0.959 


24 


1-31 


0.995 


0.944 


25 


1-22 


0.976 


0.935 


26 


1-27 


0.996 


0.928 


27 


1-24 


0.953 


0.739 


28 


1-21 


0.906


0.688 


29 


1-31 


0.986 


0.841 


30 


1-28 


0.980 


0.893 


31 


1-19 


0.993 


0.976 


32 


1-22 


0.998 


0.909 


35 


1-33 


0.949 


0.736 


36 


1-33 


0.949 


0.736 


46 


1-19 


0.570 


0.951 


67 


1-25 


0.258 


0.848 


71 


1-18 


0.949 


0.845


72 


1-30 


0.991 


0.919 


75 


1-29 


0.958 


0.854 


88 


1-20 


0.986 


0.945 


94 


1-33 


0.994


0.943 


97 


1-46 


0.964 


0.595 


103 


1-49 


0.983 


0.570 


108 


1-26 


0.978 


0.885 


111 


1-23 


0.989 


0.899 


126 


1-25


0.955 


0.803 


129 


1-19 


0.963 


0.918 


138 


1-29


0.971


0.844


143 


1-18 


0.914 


0.628 


148


1-20 


0.969 


0.904 


lob 


1-25 


0.941 


0.811 


lag 


1-22 


0.979 


0.927 


160 


1-17 


n qhH 


0.939 


161 


1-48 


0.903 


0.571


162 


1-25 


0.937 


0.729 


168 


1-16 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21 


0.945 


0.825 


180 


1-27 


0.981 


0.941 


187 


1-28 


0.982 


0.936 


190 


1-19 


0.953 


0.840 


196 


1-22 


0.975 


0.916 


197 


1-22 


0.9*3 


0.936


199 


1-20 


0.935 


0.701 


200 


1-23 


0.977 


0.773 


206


1-30 


0.984 


0.890 


207 


1-19 


0.990 


0.924 


208 


1-22 


0.974 


0.850 


210 


1-40 


0.940 


0.670 


211 


1-28 


0.971 


0.849 


216 


1-24 


0.986 


0.956 


218 


1-33 


0.961 


0.895 


219 


1-19 


0.970 


0.871 


221 


1-19 


0.904 


0.553 


222 


1-21 


0.917 


0.555 


230 


1-19 


0.991 


0.959 


231 


1-26 


0.953 


0.800 


232 


1-25 


0.988 


0.826 


239 


1-23 


0.969 


0.828 


240 


1-17 


0.982 


0.955 


241 


1-17 


0.982 


0.955 


245 


1-30 


0.970 


0.722


248 


1-22 


0.976


0.935 


249 


1-23 


0.968 


0.940


252


1-18


0.971


0.923


261 


1-24 


0.883 


0.587


265 


1-18 


0.939 


0.868 


272 


1-24 


0.953 


0.739 


283 


1-21 


0.906


0.688


284 


1-29 


0.997 


0.854


290 


1-31 


0.986 


0.841 


302 


1-28 


0.980 


0.893


304 


1-16 


0.907 


0.635 


312 


1-19 


0.993 


0.976 


313 


1-17 


0.930 


0.753 


323 


1-22 


0.998 


0.909 


324 


1-17 


0.982 


0.954 


328 


1-19 


0.971 


0.865 


329 


1-22 


0.963 


0.924 


330 


1-33 


0.978 


0.841 


331 


1-24 


0.920 


0.712 


332 


1-24 


0.975 


0.881


333 


1-19


0.964 


0.941


334 


1-20 


0.899 


0.567 


335 


1-27 


0.942 


0.813 


336 


1-20 


0.952 


0.850 


337 


1-38 


0.942


0.653 


338 


1-27 


0.973 


0.772 


339 


1-36 


0.979 


0.804 


340 


1-27 


0.888 


0.597 


343 


1-19 


0.971 


0.865 


344 


1-22 


0.994 


0.928


345 


1-17 


0.966 


0.687 


346 


1-19 


0.936 


0.822 


347 


1-22 


0.963 


0.924 


349 


1-24 


0.982 


0.966 


351 


1-21 


0.918 


0.815 


352 


1-31 


0.988 


0.912


354 


1-31 


0.974 


0.839 


355


1-29 


0.932 


0.632 


356 


1-15 


0.994 


0.969 


357 


1-33 


0.935 


0.726 


360 


1-27 


0.938 


0.827 


361


1-25 


0.954 


0.674 


362


1-22 


0.929 


0.788


363


1-21 


0.881 


0.715 


364 


1-33 


0.978 


0.841 


365 


1-33 


0.978 


0.841




1-21 


0.916


0.920


367 


1-19 


0.936


0.822 


JOB 


1-29


0.972 


0.874 


■5 *U 


1-24 


0.920 


0.712


J /I 


1-24 


0.961 


0.773




1-27 


0.919 


0.768 


J /J 


1-19 


0.986 


0.945 


J / o 


1-32 


0.994 


0.932 


J /O 


1-34 


0.987


0.810 


All 


1-17 


0.995 


0.950 


j /o 


1-49 


0.971 


0.749 


380 


1-20 


0.968 


0.874 


381 


1-20 


0.928 


0.782


JO« 


1-19 


0.986 


0.934


J o>> 


1-28 


0.965 


0.829 




1-39 


0.970 


0.551 


Job 


1-24


0.975 


0.881 


388 


1-30 


0.989 


0.868 


389 


1-19 


0.984 


0.941 


390 


1-26 


0.971 


0.782 


392 


1-20 


0.981


0.900 


393 


1-16 


0.968 


0.890


394 


1-23 


0.937 


0.701 


397 


1-22 


0.985 


0.854 


393 


1-46 


0.977 


0.698 


401 


1-20 


0.899 


0.567 


402 


1-22 


0.967 


0.931 


403 


1-27 


0.992 


0.934 


404 


1-19 


0.991 


0.973 


405 


1-23 


0.994


0.921 


407 


1-35 


0.987 


0.658 


408 


1-39 


0.976 


0.551 


409 


1-33 


0.897 


0.570 


410 


1-25 


0.990 


0.962 


411 


1-38 


0.977 


0.827 


412 


1-20 


0.944 


0.768 


413 


1-20 


0.988 


0.965 


414 


1-46 


0.993 


0.638 


415 


1-23 


0.981 


0.940


417


1-29 


0.941 


0.672 


418 


1-20 


0.952 


0.850


419 


1-19 


0.986 


0.967 


420 


1-29 


0.965 


0.861 


421 


1-22 


0.889 


0.785




1-48


0.982 


0.862 




1-19 


0.979 


0.933 


428


1-38


0.942 


0.653 


430


1-18 


0.947 


0.595 


A 1 "> 


1-33 


0.957 


0.789 


433 


1-26


0.979 


0.904 




1-27 


0.962


0.777 


435


1-24 


0.998 


0.977




1-27 


0.973 


0.772 


443


1-15 


0.966 


0.940 


448 


1-36 


0.979


0.804 




1-41 


0.958 


0.609 


455


1-33 


0.943 


0.606


457 


1-27 


0.888 


0.597 


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972 


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.966 


0.687 


510 


1-23


0.930


0.593


511 


1-23 


0.930 


0.593 


512 


1-23 


0.930


0.593 


515 


1-18 


0.978 


0.956 


523 


1-19 


0.936 


0.822 


529 


1-22 


0.963


0.924 


"Ten ~ 


1-24 


0.982 


0.966 




1-30 


0.933 


0.713




1-21 


0.973 


0.912 




1-23 


0.969 


0.784 


O / J. 


1-21 


0.918


0.815 


-Ft7 — 


1-31 


0.988 


0.912 


580 


1-39 


0.S25 


0.556 


594 


1-31 


0.974 


0.839 


608 


1-29 


0.932 


0.632 


609 


1-29 


0.932 


0.632 


610 


1-21 


0.990 


0.948 


621 


1-15 


0.994 


0.969


623 


1-33 


0.935 


0.726 


653 


1-27 


0.938 


0.827


668 


1-22 


0.929 


0.788 


677 


1-16 


0.948


0.807


685 


1-21 


0.881 


0.715 


699 


1-22 


0.975


0.816 


702 


1-31 


0.968 


0.898 


707 


1-16 


0.860 


0.562 


713 


1-25 


0.966 


0.743 


718 


1-19 


0.936 


0.822 


719 


1-20 


0.961 


0.824 


729 


1-29 


0.972 


0.874 


735 


1-46 


0.903


0.598


746 


1-14 


0.916 


0.730


747 


1-22 


0.965 


0.876


748 


1-29 


0.968 


0.785


759 


1-24 


0.961


0.773 


767 


1-27 


0.919


0.768


768


1-33 


0.900 


0.585


773 


1-42 


0.959 


0.702 


779 


1-19 


0.986 


0.945 


797 


1-19 


0.944 


0.759 


798 


1-19 


0.900


0.568 


820 


1-17 


0.995


0.950 


827 


1-49 


0.971


0.749 


848 


1-20 


0.968 


0.874 


864 


1-20 


0.928 


0.782 


866 


1-19 


0.986 


0.934 


873 


1-23


0.948 


0.886


881 


1-28 


0.965 


0.829 


887 


1-39 


0.970


0.551


927 


1-30 


0.989 


0.868 


934 


1-48 


0.988 


0.777


939 


1-39 


0.994 


0.889 


944 


1-26 


0.971 


0.782 


q cn 


1-29 


0.957


0.845


70J 


1-20 


0.981 


0.900




1-20 


0.886 


0.558 


973 


1-16 


0.968 


0.890 


980 


1-34


0.961 


0.749


981 


1-20 


0.953


0.822


984 


1-12 


0.938


0.780 


1015 


1-22 


0.985 


0.854 


1040 


1-46 


0.977 


0.698 


1052 


1-18 


0.969 


0.842 


1059


1-20 


0.927 


0.867 


1065 


1-33 


0.983


0.918 


1069 


1-22 


0.993 


0.935


1075 


X - A 1 


0.992


0.934 


1080 


1-19 


0.931


0.829 


1092 




0.991


0.973 


1094 


1-46 


0.992


0.653 


1095 


1-30 


0.974


0.929 


1105 


1-23 


0.994


0.921 


1123 


1-35


0.987


0.658 


1138


1-32


0.954


0.613 


1140


1-39


0.989


0.789


1142 


JL _ J J 


0.897


0.570 


1152 


J. - 9 


0.990


0.962 


1170 


1-38


0.977 


0.827 


1176 


1"*U 


0.944


0.768 


1187 


1-20 


0.988


0.965 


1189 


1-33


0.967 


0.839 


1192 


A - 1 b 


0.993 


0.638 


1193 


1-16 


0.925 


0.710 


1197 


1-29 


0.985


0.853 


1208 


1-23 


0.981 


0.940 


1225


1-29 


0.941 


0.672 


1245 


1-19 


0.986


0.967 


1258 


1-29 


0.965 


0.861 


1265 


1-22 


0.889 


0.785 




1-20 


0.944 


0.809 




1-48 


0.962 


0.862 




1-19 


0.979 


0.933 




1-21 


0.984


0.944 




1-19 


0.984 


0.953 




1-38 


0.942 


0.653 




1-18 


0.947 


0.595 


1<3 /X 


1-33 


0.957 


0.789 


lJOU 


1-26 


0.979 


0.904 


IJ f / 

1-1QQ 


1-27 


0.962 


0.777 




1-23 


0.997 


0.960 


X* U^l 


1-24 


0.998 


0.977 


i d i n 
x*» i u 


1-15 


6. $46" 


0.845 


1414 


1-24 


0.913 


0.588 




1-19 


0.982 


0.929 


1416 


1-12 


0.931 


0.891 


1418 


1-30 


0.933 


0.563


1420 


1-20 


0.881 


0.561 


1421 


1-19 


0.990 


0.968


1423 


1-17 


0.968 


0.863 


1424 


1-21 


0.885


0.591 


1425 


1-24


0.913


0.588 


1426 


1-24 


0.913 


0.588 


1428 


J.- 


0.957


0.899 


1430 


1-34 


0.977


0.819 


1431 


1-28


0.979


0.923 


1432 


X - JO 


0.957


0.613


1433 




0.921


0.753 


1434 




0.983 


0.621 


1435 


X - <J 


0.910


0.631 


1436 


TTT5 


0.988


0.868 


1437 


X - d. jS 


0.998


0.980 


1442 


X *iU 


0.918 


0.753 


1448 


1-12 


0.931


0.891


1462 


ni — — 


0.968


0.888 


1490


1-20 


0.881 


0.561 


1518 


1-17 


0.968 


0.863 


1525


1-21 


0.885 


0.591 


1547 


1-28 


0.974 


0.891 


1561 


1-25 


0.967 


0.899 


1580 


1-17 


0.923 


0.824 


1593 


1-28


0.979 


0.923


1596 


1-16 


0.929 


0.709 


JloUl 


1-36 


0.957 


0.613 


1606 


1-22 


0.979 


0.831 


1607 


1-20 


0.974 


0.770 


1608 


1-32 


0.921 


0.753 


1614 


1-33 


0 . 9 & 9 " 


0.829 


1616 


1-20 


0.959 


0.669 


1625 


1-39 


0.983 


0.621 


1632 


1-25 


0.910


0.631 


1636 


1-33


0.697 


0.591 


1639 


1-42 


0.988 


0.868 


1645


1-20 


0.927 


0.568 


1647


1-17 


0.923 


0.742


1648


1-22 


0.998 


0.980 



TABLE 6



SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1 




3573 


5359 


784CIP2_1


1103 


2 


1788


3574 


5360 


784CIP2_2 


2673 




1789


3575


5361 


784CIP2 3 


4li7 


4 


1790


3576 


5362 


784CIP2 4 


5556 


5


1791


3577 


5363 


784CIP2 5 


5562 


6


1792 


3578 


5364 


784CIP2 6 


5562 


7


1793 


3579 


5365 


784CIP2 7 


5562 


8


1794 


3580 


5366 


784CIP2 8 


5562 


9


1795 


3581 


5367 


784CIP2 9 


5563 


10 


1796 


3582 


5368 


784CIP2_10


5564 


11 


1797 


3583 


5369 


784CIP2 11 


5565 


12


1798 


3584 


5370 


784CIP2 12 


5689


13 


1799 


3585 


5371 


784CIP2 13 


5729 


14 


1800 


3586 


5372 


784CIP2 14 


574S 


15 


1801 


3587 


5373 


784CIP2_15


5777 


16


1802 


3588 


5374 


784CIP2_16 


5777 


17 


1803 


3589 


5375 


784CIP2 17 


5789 


18 


1804 


3590 


5376 


784CIP2 18 


5792 


19 


1805 


3591 


5377 


784CIP2_19 


5804 


20 


1806 


3592 


5378 


784CIP2_20 


5805 


21 


1807 


3593 


5379 


784CIP2 21 


5805 


22 


1808


3594 


5380 


784CIP2_22 


5844 


23 


1809 


3595 


5381 


784CIP2_23 


5844 


24 


1810 


3596 


5382 


784CIP2 24 


5850 


25 


1811 


3597


5383 


784CIP2 25 


5867 


26 


1812 


3598 


5384


784CIP2_26 


5973 


27 


1813 


3599 


5385 


784CIP2 27 


5995 


28 


1814 


3600 


5386 


784CIP2 28 


5995 




1815 


3601 


5387 


784CIP2_29 


6005 


30


1816


3602 


5388 


784CIP2 30 


6007 


31 


1817


3603 


5389 


784CIP2_31


6007


32


1818 


3604 


5390 


784CIP2 32 


6009


33


1819 


3605 


5391


784CIP2_33 


6012 


34 


1820 


3606 


5392 


784CIP2_34


6015 


35


1821 


3607 


5393 


784CIP2_35


6016 


36


1822 


3608


5394 


784CIP2_36 


6016 


37


1823 


3609 


5395 


784CIP2_37


6018 


38 


1824 


3610 


5396 


784CIP2_38 


6018 


39


1825


3611 


5397 


784CIP2 39 


6018 


40 


1826


3612 


5398 


784CIP2_40


6023 


41


1827 


3613


5399 


784CIP2_41


6070 


42 




3614


5400 


784CIP2_42


6081 


43 


1829 


3615 


5401 


784CIP2 43 


6089 


44 


1830


3616 


5402 


784CIP2 44 


6118 


45 


1831


3617


5403 


784CIP2 45 


6118 


46 


1832 


3618


5404 


784CIP2 46 


6130 


47 


1833 


3619


5405 


784CIP2 47 


6177 


48 


1834 


3620 


5406 


784CIP2_48




49 


1835 


3621 


5407 


784CIP2 49 


6191 


50


1836


3622 


5408 


784CIP2 50 


4204 


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838 


3624 


5410 


784CIP2 52 


6284 


53


1839 


3625 


5411 


784CIP2_53


6367 


54 


1840 


3626 


5412 


784CIP2 54 


6436


55 


1841 


3627 


5413 


784CIP2_55 


6442 


56 


1842 


3628 


5414 


784CIP2_56 


6445 


57 


1843 


3629 


5415 


784CIP2 57 


6457 


58 


1844 


3630 


5416 


784CIP2 58 


6458 


59


1845


3631 


5417 


784CIP2_59


6458


60 




3632 


5418 


784CIP2_60


6462 


61 


1847


3633 


5419


784CIP2 61 


6472


62 


1848


3634 


5420 


784CIP2_62 


6499 


63 


1849


3635 


5421 


784CIP2 63 


6499


64 


1850 


3636 


5422


784CIP2 64 


6505 


65 


1851 


3637 


5423 


784CIP2_65 


6534


66 


1852


3638


5424 


784CIP2_66


6534 


67 


1853 


3639 


5425 


784CIP2 67 


6540 


68 




3640 


5426


784CIP2_68


6550 


69 


1855


3641 


5427 


784CIP2 69 


6550 


70 


1856


3642 


5428 


784CIP2 70 


6592 


71 


1857 


3643 


5429 


784CIP2 71 


6645 


72 


1858


3644 


5430 


784CIP2 72 


6671 


73


1859


3645 


5431 


784CIP2_73


6763 




1860 


3646 


5432 


784CIP2_74 


6763




1861


3647 


5433 


784CIP2_75 


6786 


76


1862 


3648 


5434 


784CIP2_76


6824 


77


1863


3649 


5435 


784CIP2 77 


6830 


78


1864


3650 


5436 


784CIP2_78 


6831 


79 


1865 


3651 


5437 


784CIP2_79 


6832 


80


1866 


3652 


5438 


784CIP2_80 


6834 


81


1867 


3653 


5439


784CIP2 81 


6834 


82


1868


3654 


5440 


784CIP2 82 


6835 




1869


3655 


5441 


784CIP2_83 


6837 


84 


1870


3656 


5442 


784CIP2_84


6843 


85 


1871 


3657 


5443 


784CIP2 85 


6859 


86 


1872 


3658 


5444


784CIP2_86


6915 


87 


1873 


3659 


5445


784CIP2 87 


6932 


88


1874 


3660 


5446


784CIP2_88


6957


89


1875 


3661 


5447 


784CIP2 89 


6961 


90


1876 


3662 


5448 


784CIP2 90 


6973 


91 


1877 


3663 


5449


784CIP2 91 


6973 


92


1878 


3664 


5450 


784CIP2 93 


7007 


93 


1879 


3665 


5451 


784CIP2 94 


7018 




1880 


3666 


5452 


784CIP2_95


7019 




1881 


3667 


5453 


784CIP2 96 


7020 




1882 


3668 


5454 


784CIP2 97 


7026 




1883


3669 


5455 


784CIP2 98 


7021 


98 


1884


3670 


5456 


784CIP2 99 


7023 


99 


1885 


3671 


5457 


784CIP2 100 


7027 


100 


1886


3672


5458


784CIP2_101 


7028 


101 


1887 


3673 


5459 


784CIP2 102 


7029 


102 


1888 


3674 


5460 


784CIP2 103 


7031 


103 


1889 


3675 


5461 


784CIP2 104 


7032


104 


1890 


3676 


5462 


784CIP2 105 


7033 


105 


1891


3677


5463 


784CIP2_106


7035 


106 


1892 


3678 


5464 


784CIP2 107 


7036


107


1893 


3679 


5465 


784CIP2_108 


7039 


108 


1894 


3680 


5466 


784CIP2_109 


7043 


109


1895


3681


5467 


784CIP2_110 


7044 


110 


1896


3682 


5468 


784CIP2_111 


7046 


111 


1897


3683 


5469 


784CIP2_112 


7054 


112 


1898 


3684 


5470 


784CIP2_113 


7061 


113 


1899 


3685 


5471


784CIP2_114


7077 


114 


1900 


3686


5472 


784CIP2_115


7092 


115 


1901 


3687 


5473 


784CIP2_116 


7094 


116 


1902 


3688 


5474 


784CIP2_117 


7105 


117 


1903 


3689 


5475 


784CIP2 118 


7107 


118 


1904 


3690 


5476 


784CIP2 119 


7111 


119 


1905 


3691 


5477 


784CIP2 120 


7123 


120 


1906 


3692 


5478 


784CIP2_121


7142 


121 


1907 


3693 


5479


784CIP2 122 


7142


122 


1908 


3694 


5480 


784CIP2 123 


7154 


123 


1909 


3695 


5481 


784CIP2_124 


7160 


124 


1910 


3696 


5482


784CIP2_125 


7169 


125 


1911 


3697 


5483 


784CIP2 126 


7185 


126 


1912 


3698


5484 


784CIP2 127 


7197


127 


1913 


3699 


5485 


784CIP2_128


7219 


128 


1914 


3700 


5486 


784CIP2 129 


7226 




1915 


3701 


5487 


784CIP2 130 


7229 


130


1916 


3702 


5488 


784CIP2 131 


7234 


131 


1917 


3703 


5489 


784CIP2_132 


7235 


132


1918


3704


5490 


784CIP2_133 


7235 


133


1919 


3705 


5491 


784CIP2_134


7238 


134


1920 


3706 


5492 


784CIP2 135 


7247 


135


1921 


3707 


5493 


784CIP2_136 


7261 


136


1922 


3708 


5494 


784CIP2_137


7262 


137 


1923 


3709 


5495 


784CIP2_138 


7267 


138


1924 


3710 


5496


784CIP2_139 


7272 


139 


1925 


3711 


5497 


784CIP2 140 


7273 


140 


1926 


3712 


5498


784CIP2 141 


7282 


141 


1927 


3713 


5499 


784CIP2_142 


7288 


142 


1928


3714 


5500 


784CIP2 143 


7291 


143 


1929 


3715 


5501 


784CIP2 144 


7293 


144 


1930 


3716 


5502 


784CIP2 145 


7294 


145


1931 


3717 


5503 


784CIP2 146 


7299 


146 


1932 


3718 


5504 


784CIP2_147


7300 


147 


1933 


3719 


5505 


784CIP2 148 


7312 


148


1934 


3720 


5506 


784CIP2_149 


7313 


149


1935 


3721 


5507 


784CIP2_150 


7315 


150 


1936


3722 


5508


784CIP2_151


7318 


151 


1937 


3723 


5509 


784CIP2 152 


7321 


152 


1938


3724 


5510 


784CIP2 153 


7330 


153 


1939 


3725 


5511 


784CIP2_154 


7331 


154 


1940 


3726 


5512 


784CIP2 155 


7333 


155 


1941 


3727 


5513 


784CIP2_156 


7350 


156 


1942 


3728 


5514 


784CIP2 157 


7352 


157 


1943 


3729 


5515 


784CIP2 158 


7384 


158 


1944 


3730 


5516 


784CIP2_159 


7403 


159


1945 


3731 


5517 


784CIP2_160 


7431 


160


1946 


3732 


5518 


784CIP2_161


7441 


161


1947 


3733 


5519 


784CIP2_162 


7453 


162


1948 


3734 


5520 


784CIP2_163 


7467 


163


1949 


3735 


5521 


784CIP2 164 


7471 




1950 


3736 


5522 


784CIP2_165


7493 


165


1951 


3737 


5523


784CIP2 166 


7502 


166 


1952 


3738


5524 


784CIP2_167


7511 


167 


1953 


3739 


5525 


784CIP2_168


7514 


168 


1954 


3740 


5526 


784CIP2_169 


7520 


169 


1955 


3741 


5527 


784CIP2 170 


7541 


170


1956 


3742


5528


784CIP2 171 


7570 


171 


1957 


3743


5529 


784CIP2 172 


7578 


172


1958 


3744 


5530 


784CIP2_173 


7583 


173 


1959 


3745 


5531 


784CIP2 174 


7592 




1960 


3746 


5532 


784CIP2 175 


7601 


175 


1961 


3747 


5533 


784CIP2_176


7602 


176 


1962 


3748 


5534 


784CIP2 177 


7608


177 


1963 


3749 


5535


784CIP2_178


7615 


178 


1964 


3750 


5536 


784CIP2_179 


7617 


179 


1965 


3751 


5537


784CIP2_181 


7624 


180 


1966


3752 


5538 


784CIP2_182


7626


181 


1967


3753 


5539 


784CIP2 183 


7640 


182 


1968


3754 


5540 


784CIP2_184


7641 


183


1969


3755 


5541 


784CIP2 185 


7641




1970 


3756 


5542 


784CIP2 186 


7641 


185 


1971 


3757 


5543 


784CIP2_187 


7642 


186


1972 


3758 


5544 


784CIP2_188 


7649 


187


1973 


3759 


5545 


784CIP2 189 


7656 


188 


1974 


3760 


5546 


784CIP2 190 


7657 


189 


1975 


3761 


5547 


784CIP2_191 


7657 


190 


1976 


3762 


5548 


784CIP2_192


7662 


191 


1977 


3763 


5549 


784CIP2 193 


7668 


192 


1978 


3764 


5550 


784CIP2 194 


7673 


193 


1979 


3765 


5551 


784CIP2 195 


7690


194 


1980 


3766 


5552 


784CIP2_196 


7700 


195 


1981


3767 


5553 


784CIP2 197 


7709 


196 


1982 


3768 


5554 


784CIP2_198 


7736 


197 


1983 


3769 


5555 


784CIP2_199 


7737 


198 


1984 


3770 


5556 


784CIP2_200 


7744 


199 


1985 


3771 


5557 


784CIP2 201 


7771 


200 


1986 


3772 


5558 


784CIP2_202 


7786 


201 


1987 


3773 


5559 


784CIP2_203 


7791 


202 


1988 


3774 


5560 


784CIP2 204 


7797 


203 


1989 


3775 


5561


784CIP2_205 


7806 


204 


1990 


3776 


5562 


784CIP2_206 


7812 


205 


1991 


3777 


5563 


784CIP2_207 


7812 


206 


1992 


3778 


5564 


784CIP2_208 


7818 


207 


1993 


3779 


5565 


784CIP2_209 


7822 


208 


1994 


3780 


5566 


784CIP2 210 


7827 


209 


1995 


3781 


5567 


784CIP2 211 


7830 


210 


1996


3782 


5568 


784CIP2_212 


7835 


211 


1997 


3783 


5569 


784CIP2_214 


7840 


212 


1998 


3784 


5570 


784CIP2_215 


7858 


213 


1999 


3785 


5571 


784CIP2 216 


7858 


214 


2000 


3786 


5572 


784CIP2_217 


7861 


215 


2001 


3787 


5573 


784CIP2_218 


7866 


216 


2002 


3788 


5574 


784CIP2_219 




217


2003 


3789 


5575 


784CIP2_220 


7896 


218 


2004 


3790 


5576 


784CIP2 221 


7898 


219 


2005 


3791 


5577 


784CIP2 222 


7900 


220 


2006 


3792 


5578


784CIP2 223 


7906


221 


2007 


3793 


5579


784CIP2 224 


7908 


222 


2008 


3794 


5580 


784CIP2 225 


7909 


223 


2009 


3795 


5581 


784CIP2_226 


7917 


224 


2010 


3796 


5582 


784CIP2 227 


7932 


225 


2011 


3797 


5583


784CIP2_228


7940 


226 


2012 


3798 


5584 


784CIP2_229 


7940 


227 


2013 


3799 


5585 


784CIP2_230 


7984 


228


2014 


3800 


5586 


784CIP2_231 


7984 


229 


2015 


3801


5587 


784CIP2 232 


8001 


230 


2016 


3802 


5588 


784CIP2 233 


8021 


231 


2017 


3803 


5589 


784CIP2 234 


8029


232 


2018 


3804 


5590 


784CIP2_235


8033 


233 


2019 


3805 


5591 


784CIP2_236 


8040 


234 


2020 


3806 


5592 


784CIP2 237 


8052 


235 


2021 


3807 


5593 


784CIP2_238 


8096 


236 


2022 


3808 


5594 


784CIP2_239 


8096 


237 


2023 


3809


5595 


784CIP2 240 


8113 


238 


2024 


3810 


5596 


784CIP2_241 


8126 


239 


2025 


3811 


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2_243 


8137 


241 


2027 


3813 


5599 


784CIP2 244 


8137 


242 


2028 


3814 


5600 


784CIP2_245


8159 


243 


2029 


3815 


5601


784CIP2_246 


8159 


244 


2030 


3816 


5602 


784CIP2 247 


8161 


245 


2031 


3817 


5603 


784CIP2 248 


8176


246 


2032 


3818 


5604 


784CIP2 249 


8196 


247 


2033 


3819 


5605 


784CIP2_250


8200 


248 


2034 


3820 


5606


784CIP2 251 


8212 


249 


2035 


3821 


5607 


784CIP2_252


8220 


250 


2036 


3822 


5608 


784CIP2_253 


8236 


251 


2037 


3823 


5609 


784CIP2 254 


8254 


252 


2038 


3824 


5610 


784CIP2_255


8255 


253


2039 


3825 


5611 


784CIP2 256 


8288 


254 


2040 


3826 


5612 


784CIP2_257 


8296 


255 


2041 


3827 


5613 


784CIP2 258 


8329 


256 


2042 


3828 


5614 


784CIP2 259 


8362


257 


2043 


3829 


5615 


784CIP2 260 


8429 


258 


2044 


3830 


5616


784CIP2 261 


8436 


259 


2045 


3831 


5617 


784CIP2 262 


8446 


260 


2046 


3832 


5618 


784CIP2 263 


8472


261 


2047 


3833 


5619 


784CIP2 264 


8502 


262 


2048 


3834 


5620 


784CIP2 265 


8504 


263 


2049 


3835 


5621


784CIP2 266 


8507 


264 


2050 


3836 


5622 


784CIP2_268


8509 


265 


2051 


3837 


5623 


784CIP2_269


8515 


266 


2052 


3838 


5624 


784CIP2 270 


8519 


267 


2053 


3839 


5625 


784CIP2 271 




268


2054 


3840 


5626 


784CIP2_272


0 D J Z, 


269 


2055 


3841 


5627 


784CIP2_273


8532


270 


2056 


3842 


5628 


784CIP2_274




271 


2057 


3843 


5629 


784CIP2_275




272 


2058 


3844 


5630 


784CIP2_276


8543


273 


2059 


3845 


5631 


784CIP2_277


gen-] 


274 


2060 


3846 


5632 


784CIP2_278




275 


2061 


3847 


5633 




0 a 1 0 


276 


2062 


3848 


5634 




oozu 


277 


2063 


3849 


5635 


784CIP2 281 


fi65l 


278


2064


3850 


5636


784CIP2 282 


8623


279 


2065 


3851 


5637 


784CIP2 283 


8625


280


2066 


3852 


5638 


784CIP2 284 


8628 


281 


2067 


3853 


5639 


784CIP2 285 


8628 


282 


2068 


3854


5640 


784CIP2_286


8629 


283 


2069 


3855


5641 


784CIP2 287 


8630


284 


2070 


3856 


5642 


784CIP2 288 


8631 


285 


2071 


3857 


5643 


784CIP2 289 


8633 


286 


2072 


3858 


5644 


784CIP2 290 


8634 


287 


2073 


3859 


5645 


784CIP2_291


8635 


288 


2074 


3860 


5646


784CIP2 292 


8636 


289


2075 


3861 


5647 


784CIP2 293 


8659 


290 


2076 


3862 


5648 


784CIP2 294 


8660 


291 


2077 


3863 


5649 


784CIP2 295 


8667 


292 


2078 


3864 


5650 


784CIP2_296


8667 


293 


2079 


3865 


5651 


784CIP2 297 


8685 


294 


2080 


3866 


5652 


784CIP2 298 


8805 


295 


2081 


3867 


5653 


784CIP2 299 


8896 


296 


2082


3868


5654 


784CIP2 300 


8976 


297 


2083 


3869 


5655 


784CIP2 301 


9046 


298 


2084 


3870 


5656 


784CIP2_302


9048 


299 


2085 


3871 


5657 


784CIP2_303 


9116 


300 


2086 


3872 


5658 


784CIP2 304 


9195 


301 


2087 


3873 


5659 


784CIP2_305


9201 


302 


2088 


3874 


5660 


784CIP2_306 


9307 


303 


2089 


3875 


5661 


784CIP2 307 


9321 


304 


2090


3876 


5662 


784CIP2_308


9397 


305 


2091 


3877 


5663 


784CIP2 309 


9405 


306 


2092 


3878 


5664


784CIP2_310 


9406 


307 


2093 


3879 


5665 


784CIP2 311 


9422


308 


2094 


3880


5666 


784CIP2 312 


9494


309 


2095 


3881 


5667 


784CIP2 313 


9512 


310 


2096 


3882


5668 


784CIP2_314


9632 


311 


2097 


3883 


5669 


784CIP2 315 


9661 


312 


2098 


3884 


5670 


784CIP2 316 


9664 


313 


2099 


3885 


5671 


784CIP2_317


9691 


314 


2100 


3886 


5672


784CIP2_318


i Q7nn 


315


2101 


3887 


5673 


784CIP2 319 


/ JLo 


316


2102 


3888 


5674 




Jf /2J. 


317 


2103 


3889 


5675 




9870 


318 


2104 


3890 


5676




9887 


319 


2105 


3891 


5677 


784CIP2_323


9923 


320 


2106 


3892 


5678 


784CIP2_324


9938


321 


2107 


3893 


5679


784CIP2_325


9964 


322 


2108 


3894 


5680 


784CIP2_326


10007 


323 


2109 


3895 




784CIP2_327


10009 


324 


2110 


3896




784CIP2_328


10046 


325 


2111 


3897 


5683


784CIP2_329


10156 


326 


2112 


3898 


5684


784CIP2_330


10276 


327 


2113


3899 


5685


784CIP2_331


10283 


328 


2114 


3900 


5686


784CIP2B_1


152 


329 


2115 


3901 


5687


784CIP2B_2


167 


330 


2116 


3902 


5688


784CIP2B_3


205 


331 


2117 


3903 




784CIP2B_4


210 


332 


2118


3904 




784CIP2B_5


225 


333 


2119 


3905


5691


784CIP2B 6 


226 


334 


2120 


3906




784CIP2B 7 


264 


335


2121 


3907 


5693


784CIP2B 8 


268 


336 


2122 


3908 


5694 


784CIP2B 9 


293 


337 


2123 


3909 




784CIP2B 10 


293 


338 


2124 


3910 




784CIP2B_11


293 


339 


2125 


3911 




784CIP2B_12


302


340


2126 


3912 


5698


784CIP2B_13


311 


341 


2127 


3913 


5699 


784CIP2B_14


352 


342 


2128 


3914 


5700 




358 


343 


2129 


3915 


5701 


784CIP2B_16


368 


344 


2130 


3916 


5702 


784CIP2B_17


393


345


2131 


3917 


5703 




477 


346 


2132 


3918 


5704 


7P4PTD7R TO 


508 


347 


2133 


3919 


5705 




508 


348 


2134 


3920 


5706 






349 


2135 


3921 


5707 




c*7 a 
3 / o 


350 


2136 


3922 


5708 




588 


351 


2137 


3923 


5709 


784CTP7R 


jji 


352 


2138 


3924 


5710 


784CIP2B_25


593 


353 


2139 


3925 


5711 


784CIP2B_26


594 


354 


2140 


3926 


5712 


784CIP2B 27 


619 


355 


2141 


3927 


5713 


784CIP2B_28


620 


356 


2142 


3928


5714 


784CIP2B_29


654


357 


2143 


3929 


5715 


784CIP2B 30 


692 


358 


2144 


3930 


5716 


784CIP2B_31


753 


359 


2145 


3931 


5717 




/ oo 


360 


2146 


3932 


5718 


784CIP2B_33 


787 


361 


2147 


3933 


5719 


784CIP2B 34 


833 


362 


2148 


3934 


5720 


784CIP2B_35 


838 


363 


2149 


3935 


5721 


784CIP2B_36 


870 


364 


2150 


3936 


5722 


784CIP2B 37 


891 


365 


2151 


3937 


5723 


784CIP2B 38 


891 


366


2152 


3938 


5724 


784CIP2B_39


921 


367 


2153 


3939 


5725 


784CIP2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B 41 


932 


369 


2155


3941 


5727 


784CIP2B 42 


942


370 


2156 


3942 


5728 


784CIP2B_43 


958 


371 


2157


3943 


5729 


784CIP2B 44 


968 


372 


2158 


3944 


5730 


784CIP2B_45


992 


373


2159 


3945 


5731 


784CIP2B 46 


1025 


374 


2160 


3946 


5732 


784CIP2B 47 


1074 


375 


2161 


3947 


5733 


784CIP2B 48 


1104


376 


2162 


3948


5734 


784CIP2B_49


1114 


377 


2163 


3949


5735 


784CIP2B_50 


1144 


378 


2164 


3950 


5736 


784CIP2B_51


1262 


379 


2165 


3951 


5737 


784CIP2B_52 


1318


380 


2166 


3952 


5738 


784CIP2B 53 


1319 


381 


2167 


3953 


5739


784CIP2B 54 


1328 


382 


2168 


3954


5740 


784CIP2B_55 


1436 


383 


2169 


3955 


5741 


784CIP2B 56 


1464 


384 


2170 


3956 


5742 


784CIP2B 57 


1584 


385 


2171 


3957 


5743 


784CIP2B 58 


1617 


386 


2172 


3958 


5744 


784CIP2B 59 


1724 


387 


2173 


3959 


5745 


784CIP2B_60 


1728 


388 


2174 


3960 


5746


784CIP2B_61 


1772 


389 


2175 


3961 


5747 


784CIP2B_62


1809 


390 


2176


3962 


5748


784CIP2B 63 


1868 


391 


2177 


3963 


5749


784CIP2B 64 


1898 


392 


2178 


3964 


5750 


784CIP2B 65 


1926 


393 


2179 


3965 


5751 


784CIP2B 66 


1965


394 


2180 


3966 


5752 


784CIP2B 67 


1967 


395 


2181


3967 


5753 


784CIP2B 68 


1995 


396 


2182 


3968 


5754 


784CIP2B_69 


2005


397 


2183 


3969 


5755 


784CIP2B_70 


2027 


398 


2184 


3970 


5756 


784CIP2B 71 


20^5 


399 


2185 


3971 


5757 


784CIP2B 72 


2103 


400


2186


3972 


5758 


784CIP2B 73 


2106 


401 


2187 


3973


5759 


784CIP2B_74 


2166 


402 


2188 


3974 


5760 


784CIP2B_75


2175 


403 


2189 


3975 


5761


784CIP2B 76 


2176 


404 


2190 


3976 


5762 


784CIP2B 78 


2236


405 


2191 


3977 


5763 


784CIP2B_79 


2250 


406 


2192 


3978 


5764 


784CIP2B_80


2300


407 


2193 


3979 


5765 


784CIP2B 81 


2323 


408 


2194 


3980 


5766 


784CIP2B 82 


2340 


409 


2195 


3981 


5767 


784CIP2B 83 


2371


410 


2196 


3982 


5768 


784CIP2B_84 


2399 


411 


2197 


3983 


5769 


784CIP2B_85


2411 


412 


2198 


3984


5770 


784CIP2B 86 


2428 


413 


2199 


3985


5771 


784CIP2B 87 


2430 


414 


2200 


3986


5772 


784CIP2B 88 


2439 


415 


2201 


3987 


5773 


784CIP2B 89 


2447 


416


2202 


3988 


5774 


784CIP2B 90 


2461 


417


2203 


3989 


5775 


784CIP2B 91 


2467 


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419


2205 


3991 


5777 


784CIP2B_93


2512 


420 


2206 


3992 


5778 


784CIP2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208 


3994 


5780 


784CIP2B 96 


2816 


423 


2209 


3995 


5781 


784CIP2B 97 


2818 


424 


2210 


3996 


5782 


784CIP2B 98 


2819


425 


2211 


3997 


5783 


784CIP2B 99


2943


426 


2212 


3998 


5784 


784CIP2B 100 


3137 


427 


2213 


3999 


5785


784CIP2B 101


3137


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B_103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104 


3360 


431 


2217 


4003 


5789 


784CIP2B_105


3362


432


2218 


4004 


5790 


784CIP2B 106 


3417


433

2219 


4005 


5791 


784CIP2B_107


3418 


434 


2220 


4006 


5792 


784CIP2B_108 


3442


435


2221 


4007 


5793 


784CIP2B_109 


3442 


436


2222 


4008 


5794 


784CIP2B_110 


3444 


437 


2223 


4009 


5795 


784CIP2B_111


3855 


438


2224 


4010


5796 


784CIP2B 112 


3863 


439 


2225 


4011 


5797 


784CIP2B 113 


4090 


440 


2226 


4012


5798 


784CIP2B_114 


4105 


441 


2227


4013 


5799 


784CIP2B 115 


4142 


442 


2228 


4014 


5800


784CIP2B 116 


4142 


443 


2229 


4015 


5801 


784CIP2B_117 


4149 


444 


2230 


4016 


5802 


784CIP2B 118 


4196 


445


2231 


4017 


5803


784CIP2B 119 


4202 


446 


2232 


4018 


5804 


784CIP2B 120 


4274 


447


2233 


4019 


5805 


784CIP2B 121 


4304 


448


2234 


4020 


5806 


784CIP2B_122 


4306 


449 


2235 


4021 


5807 


784CIP2B 123 


4311 


450 


2236 


4022 


5808


784CIP2B 124 


4321 


451 


2237 


4023 


5809 


784CIP2B_125 


4323 


452 


2238 


4024 


5810 


784CIP2B 126 


4332 


453 


2239 


4025 


5811 


784CIP2B 127 


4488 


454 


2240 


4026 


5812 


784CIP2B_128 


4588 


455 


2241 


4027 


5813 


784CIP2B_129 


5569 


456 


2242 


4028 


5814 


784CIP2B 130 


5573 


457 


2243 


4029 


5815 


784CIP2B 131 


5577 


458 


2244 


4030 


5816 


784CIP2B_132 


5579 


459 


2245 


4031 


5817 


784CIP2B_133


5582


460 


2246 


4032 


5818 


784CIP2B 134 


5583 


461 


2247 


4033 


5819 


784CIP2B 135 


5584 


462 


2248 


4034 


5820 


784CIP2B 136 


5585 


463 


2249 


4035 


5821 


784CIP2B 137 


5591 


464 


2250 


4036 


5822 


784CIP2B_138 


5593 


465 


2251 


4037 


5823 


784CIP2B 139 


5594 


466 


2252 


4038 


5824 


784CIP2B 140 


5594 


467 


2253 


4039 


5825 


784CIP2B 141 


5598 


468 


2254 


4040 


5826 


784CIP2B_142


5602 


469 


2255 


4041 


5827 


784CIP2B 143 


5605


470 


2256 


4042 


5828 


784CIP2B_144


5608 


471 


2257 


4043 


5829 


784CIP2B_145 


5617 


472 


2258 


4044 


5830 


784CIP2B 146 


5620


473


2259 


4045 


5831 


784CIP2B_147 


5622 


474 


2260 


4046 


5832 


784CIP2B 148


5623


475 


2261 


4047 


5833 


784CIP2B_149


5624 


476 


2262 


4048 


5834 


784CIP2B 150 


5625


477

2263 


4049 


5835 


784CIP2B_151 


5627 


478 


2264 


4050 


5836 


784CIP2B 152 


5628 


479


2265 


4051 


5837 


784CIP2B_153 


5630 


480 


2266 


4052 


5838 


784CIP2B 154 


5632 


481 


2267 


4053 


5839 


784CIP2B_155


5640 


482 


2268 


4054 


5840 


784CIP2B_156


5641 


483 


2269 


4055 


5841 


784CIP2B 157 


5643 


484 


2270 


4056 


5842 


784CIP2B_158 


5647


485


2271


4057

5843 


784CIP2B 159 


5649 


486 


2272 


4058 


5844 


784CIP2B_160


5658 


487 


2273 


4059 


5845


784CIP2B 161 


5659 


488 


2274 


4060 


5846 


784CIP2B_162 


5667 


489 


2275 


4061 


5847 


784CIP2B_163 


5672 


490 


2276 


4062 


5848 


784CIP2B_164 


5674 


491


2277 


4063 


5849 


784CIP2B_165


5678


492 


2278 


4064 


5850 


784CIP2B 166 


5680 


493 


2279 


4065 


5851 


784CIP2B 167 


5684


494


2280 


4066 


5852 


784CIP2B_168 


5686 


495


2281


4067 


5853 


784CIP2B 169 


5694 


496


2282 


4068 


5854 


784CIP2B 170 


5698 


497


2283 


4069 


5855 


784CIP2B 171 


5699 


498


2284 


4070 


5856 


784CIP2B 172 


5712 


499


2285


4071 


5857 


784CIP2B_173


5719


500

2286


4072 


5858 


784CIP2B 174 


5720 


501 


2287 


4073 


5859 


784CIP2B 175 


5727


502

2288 


4074 


5860 


784CIP2B_176


5730 


503 


2289 


4075 


5861 


784CIP2B_177


5734 


504 


2290 


4076 


5862 


784CIP2B 178 


5738 


505 


2291 


4077 


5863 


784CIP2B_179


5739


506


2292 


4078 


5864 


784CIP2B_180


5740


507 


2293 


4079 


5865


784CIP2B 181 


5744 


508


2294 


4080 


5866


784CIP2B_182 


5748 


509 


2295 


4081 


5867 


784CIP2B_183


5749 


510 


2296 


4082


5868 


784CIP2B_184


5750 


511 


2297 


4083 


5869 


784CIP2B_185


5750 


512 


2298 


4084 


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871


784CIP2B 187 


5761 


514 


2300 


4086 


5872 


784CIP2B 188 


5762 


515 


2301 


4087 


5873 


784CIP2B 189 


5767 


516 


2302 


4088 


5874 


784CIP2B 190 


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090 


5876 


784CIP2B 192 


5784 


519 


2305 


4091 


5877 


784CIP2B 193 


5788 


520 


2306 


4092 


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B_196


5807 


522 


2308 


4094 


5880 


784CIP2B 197 


5818 


523 


2309 


4095 


5881 


784CIP2B_198 


5819 


524 


2310 


4096 


5882 


784CIP2B_199


5827


525 


2311 


4097 


5883 


784CIP2B_200


5828 


526 


2312 


4098 


5884 


784CIP2B_201


5842 


527 


2313 


4099 


5885 


784CIP2B_202 


5853 


528 


2314 


4100 


5886 


784CIP2B 203 


5861 


529 


2315 


4101 


5887 


784CIP2B_204 


5864 


530 


2316 


4102 


5888


784CIP2B_205


5865 


531 


2317 


4103 


5889


784CIP2B 206 


5871


532 


2318 


4104 


5890 


784CIP2B 207 


5873 


533 


2319 


4105 


5891 


784CIP2B_208 


5873 


534


2320 


4106 


5892 


784CIP2B_209 


5875 


535


2321 


4107


5893 


784CIP2B_210 


5878 


536 


2322 


4108 


5894 


784CIP2B_211 


5879 


537


2323 


4109 


5895 


784CIP2B 212 


5880 


538


2324 


4110 


5896 


784CIP2B_213 


5880 


539 


2325


4111 


5897 


784CIP2B_214


5880


540


2326


4112 


5898 


784CIP2B_215


5880 


541


2327 


4113


5899 


784CIP2B 216 


5885 


542


2328 


4114 


5900 


784CIP2B 217 


5895 


543


2329 


4115


5901 


784CIP2B 218 


5898 


544 


2330 


4116 


5902


784CIP2B 219 


5902 


545 


2331 


4117 


5903 


784CIP2B_220 


5904 


546 


2332


4118 


5904 


784CIP2B_221 


5918 


547 


2333


4119

5905 


784CIP2B 222 


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B_224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225 


5939 


551


2337 


4123 


5909 


784CIP2B 226 


5945 


552 


2338 


4124 


5910 


784CIP2B 227 


5946 


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554 


2340 


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967


556 


2342 


4128 


5914 


784CIP2B_232


5975 


557 


2343 


4129 


5915 


784CIP2B_233 


5977 


558 


2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B_235 


5979 


560 


2346 


4132 


5918 


784CIP2B_236 


5980 


561 


2347 


4133


5919 


784CIP2B_237 


5988 


562 


2348 


4134 


5920 


784CIP2B 238 


5989 


563 


2349 


4135


5921 


784CIP2B_239 


5991 


564 


2350 


4136 


5922 


784CIP2B 240 


5997 


565 


2351 


4137 


5923 


784CIP2B_241


5998


566 


2352 


4138 


5924 


784CIP2B 242 


6003 


567 


2353 


4139 


5925 


784CIP2B 243 


6004 


568 


2354 


4140 


5926 


784CIP2B 244 


6013 


569


2355 


4141 


5927 


784CIP2B_245


6028 


570 


2356 


4142 


5928 


784CIP2B 246 


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B 248 


6031 


573 


2359 


4145 


5931 


784CIP2B 249 


6031


574


2360 


4146


5932


784CIP2B 250 


6032 


575 


2361 


4147 


5933 


784CIP2B 251 


6037 


576 


2362 


4148 


5934 


784CIP2B 252 


6037 


577 


2363 


4149 


5935 


784CIP2B 253 


6043 


578


2364 


4150 


5936


784CIP2B_254

6044 


579 


2365 


4151 


5937


784CIP2B_255

6046 


580 


2366 


4152 


5938 


784CIP2B_256


6048


581


2367 


4153 


5939


784CIP2B_257

6049


582 


2368 


4154 


5940 


784CIP2B 258 


£6£i ' " 


583 


2369 


4155 


5941 


784CIP2B 259 


6053 


584 


2370 


4156


5942


784CIP2B 260 


6060 


585 


2371 


4157 


5943 


784CIP2B 261 


6063 


586 


2372 


4158 


5944 


784CIP2B 262 


6066 


587 


2373 


4159 


5945


784CIP2B 263 


6067 


588 


2374


4160 


5946 


784CIP2B 264 


6068 


589 


2375 


4161 


5947 


784CIP2B 265 


6073 


590 


2376 


4162 


5948 


784CIP2B 266 


6076 


591 


2377 


4163


5949 


784CIP2B 267 


6076 


592 


2378 


4164 


5950 


784CIP2B 268 


6077 


593 


2379 


4165 


5951 


784CIP2B 269 


6079 


594 


2380 


4166 


5952 


784CIP2B 270 


6082


595

2381 


4167 


5953 


784CIP2B 272 


6088 


596


2382


4168 


5954 


784CIP2B 273 


6091 


597 


2383


4169 


5955 


784CIP2B 274 


6094 


598 


2384 


4170 


5956 


784CIP2B 275 


6101 


599 


2385 


4171 


5957


784CIP2B 276 


6103 


600 


2386 


4172 


5958


784CIP2B 277 


6104 


601 


2387 


4173 


5959 


784CIP2B_278 


6108 


602


2388


4174 


5960 


784CIP2B_279 


6112 


603 


2389 


4175 


5961 


784CIP2B_280 


6121 


604 


2390


4176 


5962 


784CIP2B 281 


6125 


605 


2391 


4177 


5963 


784CIP2B_282


6126 


606 


2392 


4178 


5964 


784CIP2B_283 


6128 


607


2393 


4179 


5965 


784CIP2B 284 


6129 


608 


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181 


5967


784CIP2B_286


6133 


610 


2396 


4182 


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969 


784CIP2B 288 


6139 


612


2398 


4184 


5970 


784CIP2B 289 


6141 


613 


2399 


4185 


5971 


784CIP2B_290 


6145 


614 


2400 


4186 


5972 


784CIP2B 291 


6146 


615 


2401 


4187 


5973 


784CIP2B_292 


6148 


616 


2402 


4188 


5974 


784CIP2B_293


6149 


617 


2403


4189 


5975 


784CIP2B_294


6149


618 


2404 


4190 


5976 


784CIP2B 295 


6153 


619 


2405 


4191 


5977 


784CIP2B_296 


6159


620 


2406


4192 


5978


784CIP2B 297 


6164 


621 


2407 


4193 


5979


784CIP2B 298 


6167 


622 


2408 


4194 


5980 


784CIP2B 299 


6172 


623 


2409


4195 


5981 


784CIP2B_300 


6173 


624 


2410 


4196 


5982 


784CIP2B 301 


6190 


625 


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198


5984 


784CIP2B_303 


6196


627 


2413 


4199 


5985 


784CIP2B_304 


6197 


628 


2414 


4200 


5986 


784CIP2B 305 


6198 


629 


2415 


4201 


5987 


784CIP2B_306


6198 


630 


2416 


4202 


5988 


784CIP2B_308 


6214 


631 


2417 


4203 


5989 


784CIP2B_309 


6215 


632 


2418 


4204


5990 


784CIP2B_310 


6219 


633 


2419 


4205


5991 


784CIP2B 311 


6226 


634 


2420 


4206 


5992 


784CIP2B_312 


6229 


635 


2421 


4207 


5993 


784CIP2B_313 


6234 


636 


2422 


4208 


5994 


784CIP2B 314 


6237 


637 


2423 


4209


5995


784CIP2B 315 


6238 


638 


2424 


4210 


5996 


784CIP2B_316 


6239 


639 


2425 


4211 


5997 


784CIP2B_317 


6239 


640 


2426 


4212 


5998 


784CIP2B 318 


6239 


641 


2427 


4213 


5999 


784CIP2B_319 


6240 


642 


2428 


4214 


6000 


784CIP2B_320 


6244 


643 


2429 


4215 


6001 


784CIP2B 321 


6245 


644


2430 


4216 


6002 


784CIP2B_322 


6250 


645 


2431


4217 


6003 


784CIP2B_323 


6252 


646 


2432


4218


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B 325 


6256 


648 


2434 


4220 


6006 


784CIP2B 326 


6260 


649 


2435 


4221


6007 


784CIP2B 327 


6261 


650


2436

4222 


6008 


784CIP2B_328


6264 


651 


2437 


4223 


6009 


784CIP2B_329


6265 


652 


2438


4224 


6010 


784CIP2B_330 


6266 


653 


2439 


4225 


6011 


784CIP2B 331 


6270 


654 


2440


4226 


6012 


784CIP2B 332 


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B 335 


6276


657 


2443 


4229 


6015 


784CIP2B_336 


6281 


658 


2444 


4230 


6016


784CIP2B 337 


6281 


659 


2445 


4231 


6017 


784CIP2B 338 


6288 


660 


2446 


4232 


6018


784CIP2B 339 


6292 


661 


2447 


4233 


6019 


784CIP2B_340


6294 


662 


2448


4234 


6020 


784CIP2B_343


6312 


663


2449 


4235 


6021 


784CIP2B 344 


6312 


664 


2450


4236 


6022 


784CIP2B 345 


6312 


665 


2451 


4237 


6023 


784CIP2B 346 


6322 


666 


2452 


4238 


6024 


784CIP2B 347 


6324 


667 


2453 


4239 


6025 


784CIP2B 349 


6329 


668 


2454 


4240 


6026 


784CIP2B 350 


6331 


669 


2455 


4241 


6027 


784CIP2B 351 


6333 


670 


2456 


4242 


6028 


784CIP2B 352 


6334


671 


2457 


4243 


6029 


784CIP2B 353 


6337


672 


2458 


4244 


6030 


784CIP2B_354


6339 


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033 


784CIP2B_357


6348 


676 


2462 


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355 


679 


2465 


4251 


6037 


784CIP2B_361


6362


680


2466 


4252 


6038 


784CIP2B 362 


6368 


681 


2467


4253 


6039 


784CIP2B 363 


6369 


682 


2468 


4254 


6040 


784CIP2B 364 


6371 


683 


2469 


4255 


6041


784CIP2B 365 


6376 


684 


2470 


4256 


6042 


784CIP2B 366 


6379 


685 


2471 


4257 


6043


784CIP2B 367 


6380


686 


2472 


4258 


6044 


784CIP2B 368 


6381 


687 


2473 


4259 


6045 


784CIP2B_369 


6392 


688 


2474 


4260 


6046 


784CIP2B 370 


6395 


689 


2475 


4261 


6047 


784CIP2B 371 


6397 


690 


2476 


4262 


6048 


784CIP2B 372 


6400 


691 


2477 


4263 


6049 


784CIP2B_373


6401 


692 


2478 


4264 


6050 


784CIP2B_374


6411


693 


2479


4265 


6051 


784CIP2B 375 


6411 


694 


2480 


4266 


6052 


784CIP2B_376


6411 


695 


2481 


4267 


6053 


784CIP2B 377 


6416 


696 


2482 


4268 


6054 


784CIP2B 378 


6418 


697 


2483 


4269 


6055 


784CIP2B_379


6422 


698


2484 


4270 


6056


784CIP2B_380 


6423 


699 


2485 


4271 


6057 


784CIP2B 381 


6426 


700 


2486 


4272 


6058 


784CIP2B 382 


6427 


701 


2487 


4273 


6059 


784CIP2B 383 


6428 


702 


2488


4274


6060 


784CIP2B 384 


6429 


703 


2489 


4275 


6061 


784CIP2B 385 


6430


704 


2490 


4276 


6062 


784CIP2B_386 


6432 


705 


2491 


4277 


6063 


784CIP2B 387 


6432 


706 


2492 


4278 


6064 


784CIP2B 388 


6438 


707 


2493 


4279 


6065 


784CIP2B 389 


6441 


708 


2494 


4280 


6066 


784CIP2B 390 


6446 


709 


2495 


4281 


6067 


784CIP2B 391 


6454 


710 


2496 


4282 


6068 


784CIP2B_392


6459 


711 


2497 


4283 


6069 


784CIP2B 394 


6461 


712 


2498 


4284 


6070 


784CIP2B_395 


6467 


713 


2499 


4285 


6071 


784CIP2B 396 


6468


714 


2500 


4286


6072 


784CIP2B 397 


6487


715 


2501 


4287 


6073 


784CIP2B_398 


6491 


716 


2502 


4288 


6074 


784CIP2B 399 


6S6£ 


717 


2503 


4289 


6075


784CIP2B 401


6514


718 


2504


4290


6076 


784CIP2B 402 


6519 


719 


2505 


4291 


6077


784CIP2B 403 


6521 


720 


2506


4292


6078 


784CIP2B 404 


6532 


721 


2507 


4293 


6079 


784CIP2B 405 


6536 


722 


2508 


4294 


6080 


784CIP2B 406 


6543 


723 


2509 


4295 


6081


784CIP2B_407 


6544 


724 


2510 


4296 


6082 


784CIP2B_408


6548


725 


2511 


4297 


6083 


784CIP2B_409 


6551 


726 


2512 


4298 


6084 


784CIP2B 410 


6551


727 


2513 


4299 


6085 


784CIP2B_411


6552 


728 


2514 


4300 


6086 


784CIP2B 412 


6554 


729 


2515 


4301 


6087 


784CIP2B 413 


6556


730


2516 


4302


6088


784CIP2B_414


6560 


731 


2517 


4303 


6089 


784CIP2B 415 


6563 


732 


2518 


4304 


6090 


784CIP2B 416 


6564


733 


2519


4305 


6091 


784CIP2B_417


6567 


734 


2520 


4306


6092


784CIP2B_418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308


6094 


784CIP2B_420 


6577 


737 


2523 


4309 


6095 


784CIP2B 421 


6593 


738 


2524 


4310 


6096 


784CIP2B_422 


6595 


739 


2525 


4311 


6097 


784CIP2B_423 


6599 


740 


2526 


4312 


6098 


784CIP2B 424 


6625 


741 


2527 


4313


6099


784CIP2B 425 


6625


742 


2528 


4314 


6100 


784CIP2B_426 


6626 


743 


2529 


4315 


6101 


784CIP2B 427 


6630 


744 


2530 


4316 


6102 


784CIP2B 428 


6631 


745 


2531 


4317 


6103 


784CIP2B 429 


6632 


746


2532 


4318 


6104 


784CIP2B_430 


6633 


747


2533 


4319 


6105 


784CIP2B 431 


6634 


748


2534 


4320 


6106 


784CIP2B 432 


6638 


749 


2535 


4321 


6107 


784CIP2B 433 


6641 


750 


2536 


4322 


6108 


784CIP2B 434 


6644 


751 


2537 


4323 


6109 


784CIP2B 435 


6646 


752 


2538 


4324 


6110 


784CIP2B_436


6648 


753 


2539 


4325 


6111 


784CIP2B 437 


6652


754 


2540 


4326 


6112 


784CIP2B 438 


6654


755 


2541 


4327 


6113 


784CIP2B 439 


6657 


756 


2542 


4328 


6114 


784CIP2B 440 


6658 


757 


2543 


4329 


6115 


784CIP2B_441 


6663 


758 


2544 


4330 


6116 


784CIP2B 442 


6664 


759


2545 


4331 


6117 


784CIP2B_443 


6668


760 


2546 


4332 


6118 


784CIP2B 444 


6669 


761 


2547 


4333 


6119 


784CIP2B 445 


6673 


762 


2548 


4334 


6120 


784CIP2B_446 


6685 


763 


2549 


4335 


6121 


784CIP2B 447 


6687 


764 


2550 


4336 


6122 


784CIP2B 448 


6689 


765 


2551 


4337 


6123 


784CIP2B 449 


6693 


766 


2552


4338 


6124 


784CIP2B 450 


6698 


767 


2553 


4339 


6125 


784CIP2B_451 


6699 


768 


2554


4340 


6126 


784CIP2B_452 


6705 


769 


2555 


4341 


6127 


784CIP2B 453 


6711 


770 


2556 


4342 


6128 


784CIP2B 454 


6713


771 


2557 


4343 


6129 


784CIP2B 455 


6716 


772 


2558 


4344 


6130 


784CIP2B_456


6725 


773 


2559 


4345 


6131 


784CIP2B_457


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561 


4347 


6133 


784CIP2B 459 


6730 


776 


2562 


4348 


6134 


784CIP2B 460 


6730 


777 


2563 


4349 


6135 


784CIP2B_461 


6730 


778 


2564 


4350 


6136 


784CIP2B 462 


6732


779 


2565 


4351 


6137 


784CIP2B 463 


6733 


780 


2566 


4352 


6138


784CIP2B 464 


6737 


781 


2567 


4353 


6139


784CIP2B_465


6745 


782 


2568


4354 


6140 


784CIP2B_466


6751


783 


2569 


4355 


6141 


784CIP2B_467 


6754


784 


2570 


4356 


6142 


784CIP2B 468 


6758 


785 


2571 


4357 


6143 


784CIP2B_469 


6761 


786 


2572 


4358 


6144 


784CIP2B 470 


6765 


787 


2573 


4359 


6145 


784CIP2B_471 


6768


788 


2574 


4360 


6146 


784CIP2B 472 


6773 


789 


2575 


4361 


6147


784CIP2B_473 


6776 


790 


2576 


4362 


6148 


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B_475 


6798 


792 


2578


4364 


6150 


784CIP2B 476 


6823 


793 


2579 


4365 


6151 


784CIP2B_477


6825 


794 


2580 


4366 


6152 


784CIP2B 478 


6826 


795 


2581 


4367


6153 


784CIP2B 479 


6839 


796 


2582 


4368


6154 


784CIP2B 480 


6844


797 


2583 


4369 


6155 


784CIP2B 482 


6849 


798 


2584 


4370 


6156 


784CIP2B_483


6854 


799 


2585 


4371 


6157 


784CIP2B_484


6857 


800 


2586 


4372 


6158


784CIP2B_485


6861 


801 


2587 


4373 


6159 


784CIP2B_486


6873


802 


2588 


4374 


6160 


784CIP2B 487 


6875 


803 


2589 


4375 


6161 


784CIP2B_488


6877



WO 01/53312 



PCT/US00/34263


804 


2590


4376 


6162 


784CIP2B_489 


6880 


805


2591

4377 


6163 


784CIP2B 490 


6885 


806


2592

4378 


6164 


784CIP2B_491 


6890 


807


2593

4379 


6165 


784CIP2B 492 


6890 


808 


2594 


4380 


6166 


784CIP2B 493 


6894 


809


2595

4381 


6167 


784CIP2B_494 


6901 


810 


2596


4382 


6168 


784CIP2B_495


6904 


811 


2597 


4383 


6169 


784CIP2B 496 


6907 


812 


2598 


4384 


6170 


784CIP2B 497 


6914 


813


2599 


4385 


6171 


784CIP2B 498 


6917 


814


2600 


4386 


6172 


784CIP2B 499 


6923


815

2601 


4387 


6173 


784CIP2B 500 


6929 


816


2602 


4388 


6174 


784CIP2B 501 


6931 


817


2603 


4389 


6175 


784CIP2B 502 


6935 


818 


2604 


4390


6176 


784CIP2B 503 


6940 


819 


2605


4391 


6177


784CIP2B 504 


6945 


820 


2606 


4392 


6178 


784CIP2B_505 


6946 


821


2607 


4393 


6179 


784CIP2B 506 


6947 


822


2608 


4394 


6180 


784CIP2B 507 


6949


823


2609 


4395 


6181 


784CIP2B 508 


6959 


824 


2610 


4396


6182 


784CIP2B 509 


6960 


825 


2611 


4397 


6183


784CIP2B 510 


6962 


826 


2612 


4398


6184 


784CIP2B_511


6963 


827 


2613 


4399 


6185 


784CIP2B_512 


6967 


828 


2614 


4400 


6186 


784CIP2B_513 


6983 


829 


2615 


4401 


6187


784CIP2B_514 


6988 


830 


2616 


4402 


6188


784CIP2B 515 


6996 


831 


2617 


4403 


6189


784CIP2B 516 


7003 


832 


2618 


4404 


6190 


784CIP2B 517 


7016 


833 


2619 


4405 


6191 


784CIP2B 518 


7017 


834 


2620 


4406 


6192 


784CIP2B 519 


7025 


835 


2621 


4407 


6193 


784CIP2B 520 


7025 


836 


2622 


4408 


6194 


784CIP2B 521 


7025 


837 


2623 


4409 


6195 


784CIP2B 522 


7050 


838 


2624 


4410 


6196 


784CIP2B 523 


7051 


839 


2625 


4411 


6197 


784CIP2B 524 


7055 


840 


2626 


4412 


6198 


784CIP2B 525 


7060 


841 


2627 


4413 


6199 


784CIP2B 526 


7064 


842 


2628 


4414 


6200 


784CIP2B 527 


7067 


843 


2629 


4415 


6201 


784CIP2B 528 


7071 


844


2630


4416 


6202 


784CIP2B 529 


7072 


845


2631


4417 


6203 


784CIP2B 530 


7073 


846


2632 


4418 


6204


784CIP2B 531 


7076 


847


2633 


4419 


6205


784CIP2B 532 


7088 


848


2634

4420 


6206


784CIP2B 533 


7089 


849


2635


4421 


6207 


784CIP2B_534 


7091 


850 


2636 


4422 


6208 


784CIP2B 535 


7091 


851 


2637


4423 


6209 


784CIP2B 536 


7104


852


2638


4424 


6210 


784CIP2B 537 


7105 


853 


2639


4425 


6211 


784CIP2B_538 


7105 


854


2640 


4426 


6212 


784CIP2B_539 


7109 


855


2641 


4427 


6213 


784CIP2B 540 


7109 


856


2642 


4428 


6214 


784CIP2B 541 


7119 


857 


2643 


4429 


6215


784CIP2B_542


7120 


858 


2644 


4430 


6216 


784CIP2B 543 


7121 


859 


2645 


4431 


6217 


784CIP2B_544 


7126 


860 


2646


4432 


6218 


784CIP2B_545 


7127 


861 


2647 


4433 


6219 


784CIP2B 546 


7130 


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435


6221 


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B_549 


7159 


865


2651 


4437 


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224


784CIP2B 551 


7175 


867 


2653 


4439 


6225 


784CIP2B 552 


7188 


868 


2654 


4440 


6226 


784CIP2B 553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554 


7190 


870 


2656 


4442 


6228 


784CIP2B 555 


7191 


871 


2657 


4443 


6229 


784CIP2B_556


7203 


872 


2658 


4444 


6230 


784CIP2B 557 


7204 


873 


2659 


4445 


6231 


784CIP2B 558 


7208 


874 


2660 


4446 


6232 


784CIP2B 559 


7209


875


2661 


4447 


6233 


784CIP2B_560


7210 


876 


2662 


4448 


6234 


784CIP2B 561 


7216 


877 


2663 


4449 


6235 


784CIP2B 562 


7221 


878 


2664 


4450 


6236 


784CIP2B_563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564 


7237 


880 


2666 


4452 


6238 


784CIP2B_565


7240 


881 


2667 


4453 


6239 


784CIP2B_566 


7245 


882 


2668 


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B_568 


7251 


884 


2670 


4456


6242 


784CIP2B_569


7255 


885 


2671 


4457 


6243 


784CIP2B 570 


7260 


886 


2672 


4458 


6244 


784CIP2B_571 


7265 


887 


2673 


4459 


6245 


784CIP2B 572 


7268 


888 


2674 


4460 


6246 


784CIP2B_573 


7275 


889 


2675 


4461 


6247 


784CIP2B 574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575 


7283


891


2677 


4463 


6249 


784CIP2B_576 


7283


892


2678 


4464 


6250


784CIP2B 577 


7287 


893 


2679 


4465 


6251


784CIP2B 578 


7301 


894 


2680 


4466 


6252 


784CIP2B_579 


7308


895


2681 


4467 


6253 


784CIP2B_580


7308 


896 


2682 


4468 


6254


784CIP2B_581


7309 


897 


2683


4469


6255 


784CIP2B_582 


7319 


898 


2684 


4470 


6256 


784CIP2B 583 


7320 


899 


2685


4471 


6257 


784CIP2B 584 


7326 


900 


2686 


4472 


6258 


784CIP2B 585 


7326 


901 


2687 


4473 


6259 


784CIP2B 586 


7334 


902 


2688 


4474 


6260 


784CIP2B 587 


7337 


903 


2689 


4475 


6261 


784CIP2B_588 


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691 


4477 


6263


784CIP2B_590 


7355 


906 


2692 


4478


6264 


784CIP2B 591 


7363


907 


2693 


4479 


6265 


784CIP2B_592 


7363 


908 


2694


4480 


6266 


784CIP2B_593 


7365


909 


2695 


4481 


6267 


784CIP2B_594 


7368 


910 


2696 


4482 


6268 


784CIP2B_595 


7369


911 


2697 


4483 


6269 


784CIP2B 596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7375 


913 


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


4486 


6272 


784CIP2B_601 


7383 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B 603 


7391 


917 


2703 


4489


6275 


784CIP2B_604 


7393 


918 


2704 


4490 


6276 


784CIP2B_605 


7395 


919 


2705 


4491 


6277 


784CIP2B 606 


7397 


920 


2706 


4492 


6278 


784CIP2B_607


7399 


921 


2707 


4493 


6279 


784CIP2B_608 


7405 


922 


2708 


4494 


6280 


784CIP2B 609 


740£ 


923 


2709 


4495 


6281 


784CIP2B 610 


7406 


924 


2710 


4496 


6282 


784CIP2B 611 


7409 


925 


2711


4497 


6283 


784CIP2B 612 


7410 


926


2712 


4498


6284 


784CIP2B 613 


7411 


927 


2713 


4499 


6285 


784CIP2B 614 


7417 
WO 01/53312

SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725


928


2714 


4500 


6286 


784CIP2B_615 


7418 


929 


2715 


4501 


6287 


784CIP2B_616


7421 


930


2716 


4502


6288 


784CIP2B 617 


7422 


931 


2717 


4503 


6289 


784CIP2B_618


7422 


932 


2718


4504 


6290 


784CIP2B 619 


7423


933


2719 


4505 


6291 


784CIP2B_620


7424 


934 


2720 


4506 


6292 


784CIP2B_621


7426 


935 


2721 


4507 


6293 


784CIP2B_622


7427 


936 


2722 


4508 


6294 


784CIP2B_623


7428 


937


2723 


4509 


6295 


784CIP2B_624


7430 


938


2724 


4510 


6296 


784CIP2B_625


7435 


939 


2725 


4511 


6297 


784CIP2B 626 


7437 


940 


2726


4512


6298 


784CIP2B_627


7439 


941 


2727 


4513 


6299 


784CIP2B 628 


7440 


942 


2728 


4514 


6300 


784CIP2B_629


7442 


943 


2729 


4515 


6301 


784CIP2B_630 


7450 


944 


2730 


4516 


6302 


784CIP2B_631


7451 


945 


2731 


4517 


6303 


784CIP2B 632 


7452 


946 


2732 


4518 


6304


784CIP2B_633


7454 


947 


2733 


4519 


6305 


784CIP2B_634 


7457 


948 


2734 


4520 


6306 


784CIP2B_635 


7459 


949 


2735 


4521 


6307 


784CIP2B_636


7461 


950 


2736 


4522 


6308 


784CIP2B 637 


7463 


951 


2737 


4523 


6309 


784CIP2B 638 


7466 


952 


2738 


4524 


6310 


784CIP2B_639


7469 


953 


2739 


4525


6311 


784CIP2B_640


7473


954 


2740 


4526 


6312 


784CIP2B_641 


7481 


955 


2741 


4527 


6313 


784CIP2B 642 


7482 


956 


2742 


4528 


6314 


784CIP2B 643 


7482 


957 


2743 


4529 


6315 


784CIP2B_644 


7483 


958 


2744 


4530 


6316 


784CIP2B_645


7485 


959 


2745 


4531 


6317 


784CIP2B_646 


7486 


960 


2746 


4532 


6318


784CIP2B_647 


7487 


961 


2747 


4533 


6319 


784CIP2B_648 


7491 


962 


2748 


4534 


6320 


784CIP2B 649 


7492 


963 


2749 


4535 


6321 


784CIP2B 650 


7494


964 


2750 


4536 


6322 


784CIP2B_651


7498 


965 


2751 


4537 


6323 


784CIP2B 652 


7504 


966 


2752 


4538


6324 


784CIP2B_653


7508


967 


2753 


4539 


6325 


784CIP2B 654 


7516 


968 


2754 


4540 


6326 


784CIP2B_655


7518 


969 


2755 


4541 


6327


784CIP2B 656 


7519 


970 


2756 


4542 


6328 


784CIP2B_657 


7521 


971 


2757 


4543 


6329 


784CIP2B_658


7529 


972 


2758 


4544 


6330 


784CIP2B_659 


7532 


973 


2759 


4545 


6331 


784CIP2B_660


7533 


974 


2760


4546 


6332 


784CIP2B_661 


7535 


975 


2761 


4547


6333 


784CIP2B_662 


7545 


976 


2762 


4548 


6334 


784CIP2B_663 


7546 


977 


2763 


4549 


6335 


784CIP2B 664 


7552 


978 


2764 


4550 


6336 


784CIP2B_665 


7554 


979 


2765 


4551 


6337 


784CIP2B 666 


7567 


980 


2766 


4552 


6338 


784CIP2B_667


7569 


981 


2767 


4553 


6339 


784CIP2B_668 


7575 


982 


2768 


4554 


6340 


784CIP2B_669


7576


983 


2769 


4555 


6341 


784CIP2B_670


7577 


984 


2770 


4556 


6342


784CIP2B_671 


7579 


985 


2771 


4557 


6343 


784CIP2B_672


7582 


986 


2772 


4558 


6344 


784CIP2B 673 


7587


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988 


2774 


4560


6346 


784CIP2B_675


7597 


989 


2775 


4561 


6347 


784CIP2B 676 


7597


990 


2776 


4562 


6348 


784CIP2B_677 


7609 


991 


2777 


4563 


6349 


784CIP2B 678 


7609 


992 


2778 


4564 


6350


784CIP2B_679


7609 


993 


2779 


4565 


6351 


784CIP2B 680 


7613 


994 


2780 


4566 


6352 


784CIP2B_681


7623 


995


2781 


4567 


6353 


784CIP2B_682


7629 


996


2782


4568 


6354 


784CIP2B 683 


7630 


997 


2783 


4569 


6355 


784CIP2B_684


7633 


998 


2784 


4570 


6356 


784CIP2B_685 


7635 


999 


2785 


4571 


6357 


784CIP2B_686 


7638 


1000 


2786 


4572 


6358 


784CIP2B 687 


7639 


1001 


2787 


4573 


6359


784CIP2B 688 


7646 


1002 


2788 


4574 


6360 


784CIP2B 689 


7647 


1003 


2789


4575


6361 


784CIP2B 690 


7648 


1004 


2790 


4576 


6362 


784CIP2B 691 


7658


1005 


2791 


4577 


6363 


784CIP2B_692 


7664


1006 


2792 


4578 


6364 


784CIP2B_693 


7664 


1007 


2793 


4579 


6365 


784CIP2B 695 


7674


1008 


2794 


4580 


6366 


784CIP2B 696 


767^ 


1009 


2795 


4581 


6367


784CIP2B 697 


7676 


1010 


2796 


4582 


6368 


784CIP2B 698 


7681


1011


2797 


4583 


6369 


784CIP2B 699 


7668 


1012 


2798


4584 


6370 


784CIP2B 700 


7693


1013 


2799 


4585 


6371 


784CIP2B 701 


7694 


1014 


2800 


4586


6372 


784CIP2B 702 


7715 


1015


2801 


4587 


6373 


784CIP2B 703 


7716 


1016 


2802 


4588 


6374 


784CIP2B 704 


7718 


1017 


2803 


4589


6375 


784CIP2B_705 


7721 


1018 


2804 


4590 


6376 


784CIP2B 706 


7723 


1019


2805 


4591 


6377 


784CIP2B 707 


7729 


1020 


2806


4592 


6378 


784CIP2B 708 


7733 


1021 


2807 


4593 


6379 


784CIP2B_709 


7735 


1022 


2808 


4594


6380 


784CIP2B_710


7741


1023 


2809 


4595 


6381 


784CIP2B_711 


7743 


1024 


2810 


4596 


6382 


784CIP2B 712 


7748 


1025 


2811 


4597 


6383 


784CIP2B 713 


7749 


1026 


2812 


4598 


6384 


784CIP2B_714


7750 


1027 


2813 


4599 


6385 


784CIP2B 715 


7757 


1028 


2814 


4600 


6386 


784CIP2B_716 


7759 


1029 


2815 


4601


6387 


784CIP2B 717 


7760 


1030 


2816 


4602 


6388 


784CIP2B_718 


7760 


1031 


2817 


4603 


6389 


784CIP2B_719


7764 


1032 


2818 


4604 


6390 


784CIP2B_720


7765 


1033 


2819 


4605 


6391 


784CIP2B_721 


7766 


1034 


2820 


4606 


6392 


784CIP2B 722 


7767 


1035 


2821 


4607 


6393 


784CIP2B_723 


7769 


1036 


2822


4608 


6394 


784CIP2B 724 


7770 


1037 


2823 


4609 


6395 


784CIP2B 725 


7774 


1038 


2824 


4610 


6396 


784CIP2B 726 


7779 


1039 


2825


4611


6397 


784CIP2B 727 


7781 


1040


2826 


4612 


6398 


784CIP2B_728 


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783 


1042 


2828 


4614 


6400 


784CIP2B_730


7787 




1043


2829


4615 


6401 


784CIP2B 731 


7792 


1044 


2830 


4616 


6402 


784CIP2B_732 


7795 


1045 


2831 


4617 


6403 


784CIP2B 733 


7801 


1046 


2832 


4618 


6404 


784CIP2B_734 


7807 


1047 


2833 


4619 


6405 


784CIP2B_735


7808 


1048 


2834 


4620 


6406 


784CIP2B_736


7819 


1049 


2835 


4621 


6407 


784CIP2B_737 


7824 


1050 


2836 


4622 


6408 


784CIP2B 738 


7826 


1051 


2837 


4623 


6409 


784CIP2B_739


7829 


1052 


2838


4624 


6410


784CIP2B 740 


7832 


1053


2839 


4625 


6411 


784CIP2B 741 


7839


1054 


2840 


4626


6412


784CIP2B 743 


7847 


1055 


2841 


4627 


6413 


784CIP2B_744


7848 


1056 


2842


4628 


6414


784CIP2B 745 


7853 


1057 


2843 


4629


6415 


784CIP2B 746 


7854 


1058 


2844 


4630


6416 


784CIP2B_747


7856 


1059 


2845 


4631


6417 


784CIP2B 748 


7862 


1060 


2846 




4632


6418


784CIP2B 749 


7865 


1061 


2847


4633


6419 


784CIP2B 750 


7874 


1062 


2848 


4634 


6420 


784CIP2B 751 


7877 


1063 


2849 


4635 


6421 


784CIP2B_752 


7880 


1064 


2850 


4636 


6422 


784CIP2B 753 


7882 


1065 




2851


4637


6423 


784CIP2B_754


7884 


1066 




2852


4638


6424 


784CIP2B 755 


7886 


1067 




2853


4639


6425 


784CIP2B_756 


7888 


1068 




2854


4640


6426 


784CIP2B_757


7889 


1069 


2855


4641 


6427 


784CIP2B 758 


7901 


1070 




2856


4642


6428 


784CIP2B 759 


7910 


1071 


2857


4643 


6429 


784CIP2B_760


7911 


1072




2858


4644


6430 


784CIP2B_761


7921 


1073 




2859


4645


6431 


784CIP2B 762 


7923


1074 


2860


4646 


6432 


784CIP2B_763


7924 


1075 




2861


4647


6433 


784CIP2B 764 


7925 


1076 


2862


4648 


6434 


784CIP2B_765


7928


1077 


2863


4649 


6435 


784CIP2B_766


7929 


1078 


2864


4650 


6436 


784CIP2B_767 


7930 


1079 




2865


4651


6437 


784CIP2B 768 


7934


1080 


2866


4652 


6438 


784CIP2B_769 


7938 


1081 


2867


4653 


6439 


784CIP2B 770 


7942 


1082 




2868


4654


6440


784CIP2B 771 


7945 


1083 


2869 


4655 


6441 


784CIP2B 772 


7946 


1084 


2870


4656 


6442 


784CIP2B 773 


7948 


1085 


2871


4657 


6443 


784CIP2B 774 


7951 


1086 


2872 


4658 


6444 


784CIP2B 775 


7952 


1087


2873


4659


6445 


784CIP2B 776 


7953 


1088 


2874


4660 


6446 


784CIP2B 777 


7954 


1089




2875


4661


6447 


784CIP2B 778 


7957 


1090 


2876


4662


6448 


784CIP2B_779


7958 


1091 


2877 


4663


6449


784CIP2B 780 


7961 


1092 


2878 


4664


6450 


784CIP2B 781 


7965


1093 


2879 


4665 


6451 


784CIP2B 782 


7966 


1094 


2880


4666


6452 


784CIP2B 783 


7979 


1095 


2881 


4667


6453 


784CIP2B 784 


7986 


1096 


2882 


4668


6454 


784CIP2B 785 


7986 


1097 


2883




4669


6455


784CIP2B 786 


7988 


1098 


2884 


4670


6456 


784CIP2B 787 


7991 


1099 


2885 


4671 


6457


784CIP2B 788 


7992 


1100 


2886 


4672 


6458 


784CIP2B 789 


7992 


1101 


2887 


4673 


6459 


784CIP2B 790 


7992 


1102 


2888 


4674


6460 


784CIP2B_791


7992 


1103 


2889 


4675


6461 


784CIP2B 792 


8003 


1104 


2890 


4676


6462


784CIP2B 793 


8014 


1105 


2891 


4677 


6463 


784CIP2B_794




1106 


2892 


4678 


6464 


784CIP2B 795 


8016 


1107 


2893 


4679 


6465 


784CIP2B_796 


8017 


1108 
1109 


2894
2895 


4680 
4681 


6466 
6467 


784CIP2B 797 


8019 


1110 


2896 


4682 


6468 


784CIP2B_798 
784CIP2B 799 


8020 
8022 


1111 


2897 


4683 


6469 


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B 801 


8028 


1113 


2899 


4685 


6471 


784CIP2B 802 


8030 


1114 


2900 


4686 


6472


784CIP2B_803


8038 


1115 


2901 


4687 


6473


784CIP2B 804 


8042 


1116


2902 


4688 


6474 


784CIP2B 805 


8045 


1117 


2903 


4689 


6475 


784CIP2B_806


8045 


1118




2904


4690


6476 


784CIP2B_807


8046


1119 


2905 


4691 


6477 


784CIP2B_808 


8047 


1120 


2906 


4692 


6478 


784CIP2B_809 


8051 




1121


2907


4693


6479 


784CIP2B 810 


8059 


1122 


2908


4694


6480 


784CIP2B 811 


8064 


1123


2909 


4695


6481 


784CIP2B_812


8069 




1124


2910


4696 


6482 


784CIP2B_813


8074 


1125


2911 


4697 


6483 


784CIP2B_814 


8077 




1126


2912


4698 


6484 


784CIP2B 815 


8078 


1127 


2913 


4699 


6485 


784CIP2B 816 


8079 


1128 


2914 


4700 


6486


784CIP2B_817


8084


1129 


2915 


4701 


6487


784CIP2B_818


8088


1130 


2916 


4702 


6488 


784CIP2B_819


8090 


1131 


2917 


4703 


6489 


784CIP2B 820 


8091 


1132 


2918 


4704 


6490 


784CIP2B 821 


8099 


1133 


2919 


4705 


6491 


784CIP2B 822 


8099 


1134 


2920 


4706 


6492


784CIP2B_823 


8100 


1135 


2921 


4707 


6493 


784CIP2B 824 


8102 


1136


2922 


4708 


6494 


784CIP2B_825 


8103 


1137


2923 


4709


6495 


784CIP2B_826 


8103 


1138 


2924 


4710 


6496 


784CIP2B 827 


8104 


1139 


2925


4711 


6497 


784CIP2B 828 


8108 


1140 


2926 


4712 


6498 


784CIP2B_829 


8110 


1141


2927 


4713 


6499 


784CIP2B 830 


8116 


1142 


2928 


4714 


6500 


784CIP2B 831 


8117 


1143 


2929 


4715 


6501 


784CIP2B_832


8123 


1144 


2930 


4716 


6502 


784CIP2B 833 


8130 


1145 


2931 


4717 


6503 


784CIP2B 834 


8130 


1146 


2932 


4718 


6504 


784CIP2B_835 


8143 


1147


2933 


4719 


6505 


784CIP2B 836 


8143 


1148 


2934 


4720 


6506 


784CIP2B 837 


8154 


1149 


2935 


4721 


6507 


784CIP2B_838


8155 


1150 


2936


4722 


6508 


784CIP2B 839 


8162 


1151 


2937 


4723 


6509 


784CIP2B 840 


8163


1152


2938 


4724 


6510


784CIP2B 841 


8172 




1153


2939


4725 


6511 


784CIP2B 842 


8173 




1154


2940


4726 


6512 


784CIP2B_843 


8179 


1155 


2941 


4727 


6513 


784CIP2B 844 


8182 


1156


2942 


4728 


6514 


784CIP2B 845 


8183 


1157


2943 


4729 


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B 847 


8185 




1159


2945


4731


6517 


784CIP2B_848


8187 


1160


2946 


4732 


6518 


784CIP2B 849 


8188 


1161


2947 


4733 


6519 


784CIP2B_850


8190 




1162


2948


4734 


6520 


784CIP2B 851 


8190 


1163 


2949 


4735 


6521 


784CIP2B 852 


8192 


1164


2950 


4736 


6522 


784CIP2B 853 


8193 


1165


2951 


4737 


6523 


784CIP2B_854


8197 


1166


2952 


4738 


6524 


784CIP2B 855 


8197 


1167 


2953 


4739 




6525


784CIP2B_856


8199 


1168 


2954 


4740 


6526


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B_858 


8203 


1170 


2956 


4742 


6528 


784CIP2B_859


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173 


2959 


4745 


6531 


784CIP2B_862 


8214 


1174 


2960 


4746 


6532 


784CIP2B_863


8217 


1175 


2961 


4747 


6533 


784CIP2B 864 


8223 


1176 


2962


4748 


6534 


784CIP2B 865 


8224 


1177 


2963 


4749 


6535 


784CIP2B_866 


8226


1178 


2964 


4750 


6536 


784CIP2B_867 


8227 


1179 


2965 


4751 


6537 


784CIP2B 868 


8229 


1180 


2966 


4752 


6538 


784CIP2B 869 


8232 


1181 


2967 


4753 


6539


784CIP2B 870 


8236 


1182 


2968 


4754 


6540 


784CIP2B_871 


8239 


1183


2969 


4755 


6541 


784CIP2B 872 


8244 


1184 


2970 


4756


6542 


784CIP2B 873 


8245 


1185 


2971 


4757 


6543 


784CIP2B_874 


8248 


1186 


2972 


4758 


6544 


784CIP2B_875 


8251 


1187 


2973 


4759 


6545 


784CIP2B 876 


8253 


1188 


2974 


4760 


6546 


784CIP2B_877


8260


1189 


2975 


4761 


6547 


784CIP2B 878 


8262 


1190 


2976


4762 


6548 


784CIP2B_879 


8268 


1191 


2977 


4763 


6549 


784CIP2B 880 


8270 


1192


2978 


4764 


6550 


784CIP2B 881 


8272 


1193 


2979 


4765 


6551 


784CIP2B_882 


8274 


1194 


2980 


4766 


6552 


784CIP2B 883 


8274 


1195 


2981 


4767 


6553 


784CIP2B 884 


8275 


1196 


2982 


4768 


6554 


784CIP2B 885 


8277


1197 


2983 


4769 


6555 


784CIP2B 886 


8281 


1198 


2984 


4770 


6556 


784CIP2B 887 


8283 


1199 


2985 


4771 


6557 


784CIP2B_888


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


2987


4773 


6559 


784CIP2B 890 


8300 


1202 


2988 


4774 


6560


784CIP2B 891 


8303 


1203 


2989 


4775 


6561 


784CIP2B 892 


8304 


1204 


2990 


4776 


6562 


784CIP2B 893 


8305 


1205 


2991 


4777 


6563 


784CIP2B 894 


8309 


1206 


2992 


4778 


6564 


784CIP2B_895 


8318 


1207 


2993 


4779 


6565 


784CIP2B 896 


8319 


1208 


2994 


4780 


6566


784CIP2B 897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898 


8322 


1210 


2996 


4782 


6568 


784CIP2B_899


8323 


1211 


2997 


4783 


6569


784CIP2B 900 


8325 


1212 


2998 


4784 


6570 


784CIP2B 901 


8331 


1213 


2999 


4785


6571 


784CIP2B_902 


8332 


1214 


3000 


4786 


6572 


784CIP2B_903 


8333


1215 


3001 


4787 


6573 


784CIP2B_904 


8335 


1216


3002 


4788 


6574 


784CIP2B 905 


8336 


1217 


3003 


4789 


6575 


784CIP2B_906 


8337 


1218 


3004 


4790 


6576 


784CIP2B_907 


8340 


1219 


3005 


4791 


6577 


784CIP2B 908 


8343


1220 


3006 


4792


6578 


784CIP2B_909 


8347 


1221 


3007 


4793 


6579


784CIP2B 910 


8349 


1222 


3008


4794 


6580 


784CIP2B_911 


8351 


1223 


3009 


4795 


6581 


784CIP2B 912 


8353 


1224 


3010 


4796 


6582 


784CIP2B 913 


8355 


1225 


3011 


4797 


6583 


784CIP2B_914 


8361 


1226 


3012 


4798


6584 


784CIP2B_915 


8365 


1227 


3013 


4799 


6585 


784CIP2B_916 


8367 


1228 


3014 


4800 


6586


784CIP2B_917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919 


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8367 


1231 


3017 


4803 


6589


784CIP2B_921 


8391


1232 


3018 


4804 


6590 


784CIP2B_922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393


1234 


3020


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807 


6593 


784CIP2B_925


8395 


1236


3022 


4808 


6594 


784CIP2B_926 


8396


1237 


3023 


4809 


6595 


784CIP2B 927 


8398 


1238 


3024 


4810 


6596 


784CIP2B_928 


8402 


1239 


3025


4811 


6597 


784CIP2B 929 


8402 


1240 


3026


4812 


6598 


784CIP2B 930 


8405


1241


3027 


4813 


6599 


784CIP2B 931 


8406 


1242 


3028 


4814 


6600 


784CIP2B 932 


8409 




1243


3029


4815 


6601 


784CIP2B_933 


8410 


1244


3030 


4816 


6602 


784CIP2B 934 


8414


1245


3031 


4817 


6603 


784CIP2B_935 


8415 




1246


3032


4818 


6604 


784CIP2B 936 


8419


1247 


3033 


4819 


6605 


784CIP2B 937 


8426 


1248 


3034 


4820 


6606 


784CIP2B 938 


8430 


1249 


3035 


4821 


6607 


784CIP2B 939 


8431 


1250 


3036 


4822 


6608 


784CIP2B 940 


8432 


1251 


3037


4823 


6609 


784CIP2B 941 


8433 


1252 


3038 


4824


6610 


784CIP2B 942 


8434 


1253 


3039 


4825


6611 


784CIP2B 943 


8438 


1254 


3040 


4826 


6612 


784CIP2B_944


8439 


1255 


3041 


4827 


6613 


784CIP2B_945 


8441 


1256 


3042 


4828


6614 


784CIP2B_946 


8450 


1257 


3043 


4829 


6615


784CIP2B_947


8451 


1258 


3044 


4830


6616 


784CIP2B_948


8452


1259 


3045 


4831 


6617 


784CIP2B_949 


8460 


1260 


3046 


4832 


6618 


784CIP2B 950 


8461 


1261 


3047 


4833 


6619 


784CIP2B 951 


8462 


1262 


3048


4834 


6620 


784CIP2B 952 


8464


1263 


3049 


4835 


6621 


784CIP2B 953 


8465 


1264 


3050 


4836 


6622 


784CIP2B_954 


8467 


1265 


3051 


4837


6623 


784CIP2B 955 


8470 


1266 


3052 


4838 


6624


784CIP2B 956 


8471 


1267 


3053 


4839


6625 


784CIP2B 957 


8473 


1268 


3054 


4840 


6626 


784CIP2B 958 


8474 


1269 


3055 


4841 


6627


784CIP2B 959 


8475 


1270 


3056 


4842 


6628 


784CIP2B 960 


8476


1271 


3057 


4843 


6629 


784CIP2B_961


8480 


1272 


3058 


4844 


6630 


784CIP2B_962 


8482 


1273 


3059 


4845 


6631


784CIP2B_963 


8482 


1274 


3060 


4846 


6632 


784CIP2B_964


8486 


1275 


3061 


4847 


6633 


784CIP2B_965 


8488


1276 


3062 


4848


6634 


784CIP2B_966 


8492 


1277 


3063 


4849 


6635 


784CIP2B 967 


8494 


1278 


3064 


4850 


6636 


784CIP2B_968


8496 


1279 


3065 


4851 


6637 


784CIP2B 969 


8497 


1280


3066 


4852 


6638 


784CIP2B 970 


8499 


1281


3067 


4853


6639 


784CIP2B 971 


8513 




1282


3068


4854 


6640 


784CIP2B 972 


8522 


1283


3069 


4855 


6641 


784CIP2B 973 


8526 


1284


3070 


4856 


6642 


784CIP2B_974 


8531 


1285


3071 


4857 


6643 


784CIP2B_975 


8533 


1286


3072 


4858 


6644 


784CIP2B 976 


8542 


1287


3073 


4859 


6645 


784CIP2B 977 


8544 




1288


3074


4860 


6646 


784CIP2B 978 


8565




1289


3075


4861 


6647 


784CIP2B 979 


8565 


1290 


3076 


4862 


6648 


784CIP2B 980 


8572 


1291 


3077


4863 


6649 


784CIP2B 981 


8576 


1292 


3078 


4864 


6650 


784CIP2B_982 


8578 


1293 


3079 


4865 


6651 


784CIP2B_983 


8584 


1294 


3080 


4866 


6652 


784CIP2B 984 


8598


1295 


3081 


4867 


6653 


784CIP2B 985 


8602 


1296 


3082 


4868 


6654 


784CIP2B 986 


8604 


1297 


3083 


4869 


6655 


784CIP2B 987 


8609 


1298 


3084 


4870 


6656


784CIP2B 988 


8612 


1299 


3085 


4871 


6657


784CIP2B 989 


8637 


1300 


3086 


4872 


6658 


784CIP2B_990


8640 


1301 


3087 


4873 


6659 


784CIP2B 991 


8643 


1302 


3088 


4874 


6660 


784CIP2B 992 


8645 


1303 


3089 


4875 


6661


784CIP2B 993 


8650 


1304 


3090 


4876 


6662 


784CIP2B 994 


8651 


1305 


3091 


4877 


6663 


784CIP2B_995 


8654 


1306 


3092 


4878 


6664 


784CIP2B 996 


8655 


1307 


3093 


4879 


6665 


784CIP2B_997 


8657 


1308 


3094 


4880 


6666 


784CIP2B 998 


8665 


1309 


3095 


4881


6667 


784CIP2B 999 


8668 


1310 


3096 


4882 


6668 


784CIP2B 1000 


8671 


1311 


3097 


4883 


6669


784CIP2B 1001 


8672 


1312 


3098 


4884 


6670 


784CIP2B 1002 


8692


1313 


3099 


4885 


6671 


784CIP2B_1003


8706


1314 


3100 


4886 


6672 


784CIP2B_1004


8716 


1315 


3101 


4887 


6673 


784CIP2B 1005 


8719 


1316 


3102 


4888 


6674 


784CIP2B 1006 


8743 


1317 


3103 


4889 


6675 


784CIP2B 1007 


8764 


1318 


3104 


4890 


6676 


784CIP2B_1008 


8764


1319 


3105 


4891 


6677


784CIP2B 1009 


8764 


1320 


3106 


4892 


6678 


784CIP2B 1010 


8774 


1321 


3107 


4893 


6679 


784CIP2B 1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B 1012 


8796 


1323


3109 


4895 


6681 


784CIP2B_1013


8827 


1324 


3110 


4896 


6682 


784CIP2B 1014 


8842 


1325 


3111


4897 


6683 


784CIP2B 1015 


8842 


1326 


3112 


4898 


6684 


784CIP2B 1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B_1017


8871


1328 


3114 


4900 


6686


784CIP2B 1018 


8921 


1329


3115 


4901


6687 


784CIP2B 1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B 1020 


8942 


1331 


3117 


4903 


6689


784CIP2B 1021 


8994 


1332 


3118 


4904 


6690


784CIP2B 1022 


9023 


1333 


3119 


4905 


6691 


784CIP2B 1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B 1024 


9056 


1335 


3121 


4907 


6693 


784CIP2B 1025 


9058 


1336 


3122 


4908 


6694 


784CIP2B 1026 


9079 


1337 


3123 


4909 


6695


784CIP2B 1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B 1028 


9082 


1339 


3125 


4911 


6697 


784CIP2B 1029 


9084 


1340 


3126 


4912 


6698 


784CIP2B_1030 


9093 


1341 


3127 


4913 


6699


784CIP2B_1031


9101 


1342 


3128 


4914 


6700 


784CIP2B_1032 


9103 


1343 


3129 


4915 


6701


784CIP2B_1033


9105 


1344 


3130 


4916 


6702 


784CIP2B_1034


9151 


1345 


3131 


4917 


6703


784CIP2B 1035 


9161 


1346 


3132 


4918 


6704 


784CIP2B 1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B_1037 


9174 


1348 


3134 


4920 


6706 


784CIP2B 1038 


9204 


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B 1040 


9235 


1351 
1352 


3137
3138 


4923 
4924 


6709 
6710 


784CIP2B 1041 


9239 


1353 




3139


4925


6711


784CIP2B_1042 
784CIP2B_1043


9256 
9276 


1354 


3140 


4926 


6712 


784CIP2B_1044


9345


1355 


3141 


4927 


6713 


784CIP2B 1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B 1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B 1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B 1048 


9469 


1359 


3145 


4931 


6717 


784CIP2B 1049 


9500 


1360


3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933 


6719


784CIP2B_1051


9520


1362 


3148 




4934


6720


784CIP2B 1052 


9541 


1363 


3149


4935 


6721 


784CIP2B_1053 


9541 


1364 


3150


4936 


6722 


784CIP2B 1054 


9548 




1365


3151


4937 


6723 


784CIP2B 1055 


9556 


1366 


3152


4938


6724 


784CIP2B_1056 


9556 


1367 


3153 


4939 


6725 


784CIP2B 1057 


9575 


1368 


3154


4940 


6726 


784CIP2B 1058 


9589 


1369




3155


4941


6727 


784CIP2B 1059 


9599 


1370 


3156 


4942 


6728 


784CIP2B_1060


9602 


1371 




3157


4943


6729 


784CIP2B 1061 


9606 


1372 


3158


4944 


6730 


784CIP2B 1062 


9622 


1373 


3159


4945 


6731 


784CIP2B 1063 


9623 


1374 




3160


4946


6732 


784CIP2B 1064 


9646 


1375 




3161


4947


6733 


784CIP2B 1065 


9747


1376 


3162


4948 


6734 


784CIP2B 1066 


9773 


1377


3163


4949


6735 


784CIP2B 1067 


9785 


1378


3164 


4950 


6736


784CIP2B 1068 


9801 


1379 




3165


4951


6737 


784CIP2B 1069 


9811 


1380


3166


4952 


6738 


784CIP2B 1070 


9843 




1381


3167


4953


6739 


784CIP2B 1071 


9854 




1382


3168


4954 


6740 


784CIP2B 1072 


9854 


1383 


3169


4955 


6741 


784CIP2B 1073 


9864 


1384


3170


4956 


6742 


784CIP2B 1074 


9864 


1385


3171 


4957 


6743 


784CIP2B 1075 


9871 


1386


3172 


4958 


6744 


784CIP2B_1076


9879 


1387


3173 


4959 


6745 


784CIP2B 1077 


9881 


1388


3174


4960 


6746


784CIP2B 1078 


9885 


1389 


3175


4961 


6747


784CIP2B 1079 


9901 


1390


3176


4962 


6748 


784CIP2B 1080 


9912 




1391


3177


4963


6749 


784CIP2B 1081 


9916 


1392


3178 


4964 


6750 


784CIP2B 1082 


9921 


1393 


3179 


4965 


6751 


784CIP2B 1083 


9925 


1394


3180 


4966 


6752 


784CIP2B 1084 


9930 




1395


3181


4967 


6753 


784CIP2B_1085


9949 


1396


3182 


4968


6754


784CIP2B 1086 


9951 




1397


3183


4969 


6755 


784CIP2B 1087 


9959 


1398 


3184 


4970 


6756 


784CIP2B 1088 


9973 


1399 


3185 


4971 


6757 


784CIP2B 1089 


9982 


1400 


3186 


4972 


6758


784CIP2B 1090 


9994 


1401 


3187


4973 


6759 


784CIP2B 1091 


10021 


1402 


3188 


4974 


6760


784CIP2B 1092 


10041 


1403 


3189 


4975 


6761 


784CIP2B 1094 


10067 


1404 


3190 


4976 


6762 


784CIP2B_1095 


10073 


1405 


3191


4977


6763 


784CIP2B 1096 


10112 


1406 


3192 


4978 


6764 


784CIP2B 1097 


10117 


1407 


3193 


4979 


6765 


784CIP2B_1098 


10132 


1408 


3194 


4980


6766 


784CIP2B 1099 


10169 


1409 


3195


4981


6767 


784CIP2B 1100 


10217 


1410 


3196 


4982


6768 


784CIP2B 1101 


10226 


1411 


3197 


4983 


6769 


784CIP2B 1102 


10232 


1412 


3198


4984 


6770 


784CIP2B 1103 


10237 


1413 


3199 


4985 


6771 


784CIP2B 1104 


10279 


1414 


3200


4986 


6772 


784CIP2C_1 


33 


1415 


3201 


4987 


6773 


784CIP2C_2

271


1416 


3202 


4988 


6774 


784CIP2C 3 


846


1417 


3203 


4989 


6775 


784CIP2C_4 


849 


1418 


3204 


4990 


6776 


784CIP2C_5 


864 


1419 


3205 


4991 


6777 


784CIP2C 6 


953 


1420 


3206 


4992 


6778 


784CIP2C_7 


980 


1421 


3207 


4993 


6779 


784CIP2C_8 


1595 


1422 


3208 


4994 


6780 


784CIP2C 9 


1697 


1423 


3209 


4995 


6781 


784CIP2C 10 


1744 
SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725


1424

3210 


4996 


6782 


784CIP2C 11 


1937


1425


3211 


4997 


6783 


784CIP2C 12 


1955 


1426

3212 


4998 


6784 


784CIP2C_13 


1955 


1427 


3213 


4999 


6785


784CIP2C_14 


2185 


1428 


3214 


5000 


6786 


784CIP2C_15 


2889 


1429


3215 


5001 


6787 


784CIP2C 16 


2901 


1430


3216 


5002 


6788 


784CIP2C 17 


2902 


1431 


3217 


5003 


6789 


784CIP2C 18 


2905 


1432 


3218 


5004 


6790 


784CIP2C 19 


2948 


1433 


3219 


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


6792 


784CIP2C_21 


2959 


1435


3221 


5007 


6793 


784CIP2C_22 


f 29£$ 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C 24 


2970 


1438


3224 


5010 


6796 


784CIP2C_25 


2985 


1439 


3225 


5011 


6797 


784CIP2C_26 


2987 


1440 


3226 


5012 


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C_28 


2993 


1442 


3228 


5014 


6800 


784CIP2C_29 


3017 


1443 


3229 


5015 


6801 


784CIP2C 30 


3046 


1444 


3230 


5016 


6802 


784CIP2C_31 


3050 


1445 


3231


5017 


6803 


784CIP2C_32


3357 


1446 


3232 


5018 


6804 


784CIP2C_33 


3359 


1447 


3233 


5019 


6805 


784CIP2C 34 


3432 


1448 


3234 


5020 


6806 


784CIP2C 35 


3438 


1449 


3235 


5021 


6807 


784CIP2C 36 


3439 


1450 


3236 


5022 


6808


784CIP2C_39 


3463 


1451 


3237 


5023 


6809 


784CIP2C 40 


3466 


1452 


3238 


5024 


6810 


784CIP2C_41 


3466 


1453 


3239 


5025 


6811


784CIP2C 42 


3467 


1454 


3240 


5026 


6812


784CIP2C_43 


3466


1455 


3241 


5027 


6813 


784CIP2C_44 


3483 


1456 


3242 


5028


6814


784CIP2C_45 


3484 


1457 


3243 


5029 


6815


784CIP2C 46 


3488 


1458 


3244 


5030 


6816 


784CIP2C 47 


3491 


1459 


3245 


5031 


6817 


784CIP2C 48 


3493 


1460 


3246 


5032 


6818 


784CIP2C 49 


3494 


1461 


3247 


5033 


6819 


784CIP2C 50 


3495 


1462 


3248 


5034 


6820 


784CIP2C 51 


3496 


1463 


3249 


5035 


6821 


784CIP2C 52 


3503 


1464 


3250 


5036 


6822


784CIP2C_53 


3503 


1465 


3251 


5037


6823 


784CIP2C 54 


3504


1466 


3252 


5038 


6824 


784CIP2C_55 


3511 


1467 


3253 


5039 


6825 


784CIP2C_56


3531 


1468 


3254 


5040 


6826 


784CIP2C_57


3536 


1469 


3255 


5041 


6827


784CIP2C 58 


3546 


1470


3256 


5042


6828 


784CIP2C_59


3548 


1471


3257 


5043 


6829 


784CIP2C 60 


3551 


1472


3258


5044 


6830 


784CIP2C 61 


3553 


1473


3259


5045 


6831


784CIP2C_62


3564 


1474


3260 


5046 


6832 


784CIP2C 63 


3567 


1475 


3261 


5047 


6833


784CIP2C_64


3572 


1476 


3262 


5048 


6834 


784CIP2C 65 


3573 


1477 


3263 


5049


6835 


784CIP2C 66 


3574 


1478 


3264 


5050 


6836 


784CIP2C 67 


3583 


1479 


3265 


5051 


6837 


784CIP2C 68 


3615 


1480 


3266 


5052 


6838 


784CIP2C 69 


3623 


1481 


3267


5053 


6839 


784CIP2C 70 


3629 


1482 


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841 


784CIP2C_72


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C_74 


3912


1486 


3272 


5058 


6844 


784CIP2C 75 


3924 


1487 


3273 


5059 


6845 


784CIP2C 76 


3928 


1488 


3274 


5060 


6846 


784CIP2C 77 


3935 


1489 


3275 


5061 


6847 


784CIP2C_78


3959 


1490 


3276 


5062 


6848 


784CIP2C 79 


3981 


1491 


3277 


5063 


6849 


784CIP2C 80 


3989 


1492 


3278 


5064 


6850 


784CIP2C_81 


4295 


1493 


3279 


5065 


6851 


784CIP2C 82 


4300 


1494 


3280 


5066 


6852 


784CIP2C_83 


4360 


1495 


3281 


5067 


6853 


784CIP2C_84 


4362 


1496 


3282 


5068 


6854 


784CIP2C 85 


4371 


1497


3283 


5069 


6855 


784CIP2C 86 


4373 


1498 


3284 


5070 


6856 


784CIP2C_87 


4376 


1499 


3285 


5071 


6857 


784CIP2C 89 


4378 


1500 


3286 


5072 


6858 


784CIP2C 90 


4382 


1501 


3287 


5073 


6859 


784CIP2C 91 


4409 


1502


3288 


5074 


6860 


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C 93 


4421 


1504 


3290 


5076 


6862 


784CIP2C_94 


4426 


1505 


3291 


5077 


6863 


784CIP2C_95 


4430 


1506 


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865


784CIP2C 97 


4436 


1508 


3294 


5080 


6866 


784CIP2C 98 


4439 


1509 


3295 


5081 


6867 


784CIP2C_99 


4440 


1510 


3296 


5082 


6868


784CIP2C_100


4441 


1511 


3297 


5083 


6869 


784CIP2C_101


4442 


1512 


3298 


5084 


6870


784CIP2C_102 


4455 


1513 


3299 


5085 


6871


784CIP2C 103 


4462 


1514 


3300 


5086 


6872 


784CIP2C_104 


4466 


1515 


3301 


5087 


6873 


784CIP2C_105 


4469 


1516 


3302 


5088 


6874


784CIP2C 106 


4477


1517


3303 


5089 


6875 


784CIP2C 107 


4481 


1518 


3304 


5090 


6876 


784CIP2C_108 


4483 


1519 


3305 


5091 


6877 


784CIP2C_109 


4484 


1520 


3306 


5092 


6878 


784CIP2C_110 


4486 


1521 


3307 


5093 


6879


784CIP2C_111 


4490 


1522 


3308 


5094 


6880 


784CIP2C 112 


4499 


1523 


3309 


5095 


6881


784CIP2C_113


4503


1524 


3310 


5096 


6882 


784CIP2C 114 


4506 


1525


3311 


5097 


6883 


784CIP2C 115 


4509 


1526 


3312 


5098 


6884 


784CIP2C 116 


4514 


1527 


3313 


5099 


6885 


784CIP2C_117 


4516 


1528 


3314 


5100 


6886


784CIP2C 118 


4522 


1529 


3315 


5101 


6887 


784CIP2C_119


4525 


1530 


3316 


5102 


6888 


784CIP2C_120


4527 


1531 


3317 


5103 


6889 


784CIP2C_121 


4528 


1532 


3318 


5104 


6890 


784CIP2C_122


4529 


1533 


3319 


5105 


6891 


784CIP2C_123 


4532 


1534 


3320 


5106


6892 


784CIP2C 124 


4537 


1535 


3321 


5107 


6893 


784CIP2C_125


4538 


1536 


3322 


5108 


6894 


784CIP2C 126 


4551 


1537 


3323 


5109 


6895 


784CIP2C_127


4552 


1538 


3324


5110 


6896 


784CIP2C 128 


4559 


1539 


3325 


5111 


6897 


784CIP2C_129 


4567 


1540 


3326 


5112 


6898


784CIP2C_130 


4568 


1541 


3327 


5113


6899


784CIP2C_132


4585 


1542 


3328 


5114 


6900


784CIP2C 133 


4592 


1543 


3329 


5115 


6901 


784CIP2C_134 


4609 


1544 


3330 


5116 


6902


784CIP2C_135 


4616 


1545 


3331 


5117 


6903 


784CIP2C 136 


4617 


1546


3332 


5118 


6904 


784CIP2C 137 


4618 


1547 


3333 


5119 


6905 


784CIP2C 138 


4620 


1548 


3334 


5120 


6906 


784CIP2C 139 


4624 


1549 


3335 


5121 


6907 


784CIP2C 140 


4632 


1550 


3336 


5122 


6908 


784CIP2C_141


4634 


1551 


3337 


5123 


6909 


784CIP2C_142 


4638 


1552 


3338 


5124 


6910 


784CIP2C 143 


4639 


1553 


3339 


5125 


6911 


784CIP2C_144


4643 


1554 


3340 


5126 


6912 


784CIP2C 145 


4644 


1555 


3341 


5127 


6913 


784CIP2C_146


4655 


1556 


3342 


5128 


6914 


784CIP2C 147 


4668 


1557 


3343 


5129 


6915 


784CIP2C_148


4677 


1558 


3344 


5130 


6916 


784CIP2C 149 


4677 


1559 


3345 


5131


6917


784CIP2C 150 


4677 


1560 


3346 


5132 


6918 


784CIP2C_152


4682 


1561 


3347 


5133 


6919 


784CIP2C 153 


4690 


1562 


3348 


5134 


6920 


784CIP2C 154 


4691 


1563 


3349 


5135 


6921 


784CIP2C_155 


4727 


1564 


3350 


5136 


6922 


784CIP2C_156 


4730 


1565 


3351 


5137 


6923 


784CIP2C_157 


4734 


1566 


3352 


5138 


6924 


784CIP2C_158


4757 


1567 


3353 


5139 


6925 


784CIP2C_159


4764 


1568 


3354 


5140 


6926 


784CIP2C 160 


4786 


1569 


3355 


5141 


6927 


784CIP2C 161 


4793 


1570 


3356


5142 


6928 


784CIP2C_162 


4825 


1571 


3357 


5143 


6929 


784CIP2C 163 


4826 


1572 


3358 


5144 


6930 


784CIP2C_164


4850 


1573 


3359 


5145 


6931


784CIP2C 165 


4853 


1574 


3360 


5146 


6932 


784CIP2C 166 


4855 


1575 


3361 


5147


6933 


784CIP2C_167


4856 


1576


3362 


5148 


6934 


784CIP2C 168 


4867 


1577 


3363 


5149 


6935 


784CIP2C_169 


4869 


1578 


3364 


5150 


6936 


784CIP2C 170 


4878 


1579 


3365 


5151 


6937 


784CIP2C_171


4880 


1580 


3366 


5152


6938


784CIP2C_172 


4942 


1581 


3367 


5153 


6939 


784CIP2C 173 


4945 


1582 


3368 


5154 


6940 


784CIP2C_174 


4950 


1583 


3369 


5155 


6941


784CIP2C_175


4952 


1584 


3370 


5156 


6942 


784CIP2C_176 


4954 


1585 


3371 


5157 


6943 


784CIP2C 177 


4958 


1586 


3372 


5158 


6944 


784CIP2C_178


4961 


1587 


3373 


5159 


6945 


784CIP2C_179 


5590 


1588 


3374 


5160


6946 


784CIP2C_180 


5599


1589 


3375 


5161 


6947 


784CIP2C 181 


5692 


1590 


3376 


5162 


6948 


784CIP2C 182 


5732 


1591 


3377 


5163 


6949 


784CIP2C 183 


5765 


1592 


3378 


5164 


6950


784CIP2C_184


5771 


1593 


3379 


5165 


6951 


784CIP2C 185 


5774 


1594 


3380 


5166 


6952 


784CIP2C 186 


5793 


1595


3381 


5167 


6953 


784CIP2C 187 


5806 


1596 


3382 


5168 


6954


784CIP2C 188 


5852 


1597 


3383 


5169 


6955 


784CIP2C 189 


5892


1598 


3384 


5170 


6956 


784CIP2C 190 


6057 


1599 


3385 


5171 


6957


784CIP2C 191 


6061 


1600 


3386 


5172 


6958 


784CIP2C_192


6109 


1601


3387 


5173 


6959 


784CIP2C 193 


6160 


1602 


3388 


5174


6960 


784CIP2C_194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195 


6398 


1604 


3390 


5176 


6962 


784CIP2C_196 


6398 


1605 


3391 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C 198 


6448 


1607 


3393 


5179 


6965 


784CIP2C 199 


6469 


1608


3394 


5180 


6966 


784CIP2C 200 


6476 


1609 


3395


5181 


6967 


784CIP2C 201 


6561 


1610 


3396


5182 


6968 


784CIP2C_202 


6574 


1611 


3397 


5183 


6969 


784CIP2C_203 


6578 


1612 


3398 


5184 


6970 


784CIP2C 204 


6662 


1613 


3399 


5185 


6971 


784CIP2C_205


6672 


1614 


3400 


5186


6972 


784CIP2C_206 


6691 


1615 


3401 


5187 


6973 


784CIP2C 207 


6695 


1616 


3402 


5188 


6974 


784CIP2C 208 


6746 


1617 


3403 


5189 


6975 


784CIP2C 209 


6898 


1618 


3404 


5190 


6976 


784CIP2C 210 


6938 


1619 


3405 


5191 


6977 


784CIP2C 211 


6943


1620 


3406 


5192 


6978 


784CIP2C_212


7110


1621 


3407 


5193 


6979 


784CIP2C 213 


7200 


1622 


3408 


5194 


6980 


784CIP2C 214 


7212 


1623 


3409 


5195 


6981 


784CIP2C 215 


7218 


1624 


3410 


5196 


6982 


784CIP2C 216 


7249 


1625 


3411 


5197 


6983 


784CIP2C 217 


7500 


1626 


3412 


5198 


6984 


784CIP2C 218 


7509 


1627 


3413 


5199 


6985 


784CIP2C 219 


7523 


1628


3414 


5200 


6986


784CIP2C 220 


7544 


1629 


3415 


5201 


6987 


784CIP2C 221 


7564 


1630 


3416 


5202 


6988 


784CIP2C 222 


7568 


1631 


3417 


5203 


6989 


784CIP2C 223 


7631 


1632 


3418 


5204 


6990 


784CIP2C 224 


7813 


1633 


3419 


5205 


6991 


784CIP2C 225 


7831 


1634 


3420 


5206 


6992 


784CIP2C 226 


7843 


1635 


3421 


5207 


6993 


784CIP2C 227 


7907


1636 


3422 


5208 


6994 


784CIP2C 228 


7943 


1637 


3423 


5209 


6995 


784CIP2C 229 


8175 


1638 


3424 


5210 


6996 


784CIP2C 230 


8216 


1639 


3425 


5211 


6997 


784CIP2C 231 


8225 


1640 


3426 


5212 


6998 


784CIP2C 232 


8271 


1641 


3427 


5213 


6999 


784CIP2C 233 


8397


1642 


3428 


5214 


7000 


784CIP2C 234 


8466 


1643 


3429 


5215 


7001 


784CIP2C 235 


8503 


1644 


3430 


5216 


7002 


784CIP2C 236 


8953 


1645 


3431


5217


7003 


784CIP2C 237 


9106 


1646 


3432 


5218 


7004 


784CIP2C_238


9139


1647 


3433 


5219


7005 


784CIP2C 239 


955S 


1648 


3434 


5220 


7006 


784CIP2C_240 


9650 


1649 


3435 


5221 


7007 


784CIP2C 241 


9889 


1650 


3436


5222 


7008 


784CIP2C_242


493* 


1651 


3437 


5223 


7009 


784CIP2C_243 


9953 


1652 


3438 


5224 


7010 


784CIP2C_244 


9981 


1653 


3439 


5225 


7011 


784CIP2D 1 


746 


1654 


3440 


5226 


7012 


784CIP2D_2


3$£& 


1655 


3441 


5227 


7013 


784CIP2D_3 


3558 


1656


3442 


5228 


7014 


784CIP2D 4 


3633 


1657


3443 


5229 


7015 


784CIP2D 5 


3658 


1658 


3444 


5230 


7016


784CIP2D_6 


3732 


1659 


3445 


5231


7017 


784CIP2D 7 


4004 


1660 


3446 


5232 


7018 


784CIP2D 8 


4700 


1661 


3447 


5233 


7019 


784CIP2D_9


4703 


1662 


3448 


5234 


7020 


784CIP2D 10 


4774 


1663 


3449 


5235 


7021 


784CIP2D 11 


4894 


1664


3450 


5236


7022 


784CIP2D 12 


4918 


1665 


3451 


5237 


7023 


784CIP2D_13 


5159 


1666 


3452


5238 


7024


784CIP2D 14 


7443 


1667


3453 


5239 


7025 


784CIP2D_15 


8673 


1668 


3454 


5240 


7026 


784CIP2D 16 


8679 


1669 


3455 


5241 


7027


784CIP2D 17 


8727 


1670 


3456


5242 


7028 


784CIP2D 18 


8734 


1671 


3457 


5243 


7029 


784CIP2D 19 


8756 


1672 


3458

5244 


7030 


784CIP2D 20 


8018 


1673 


3459


5245 


7031 


784CIP2D 21 


8844 


1674


3460 


5246 


7032 


784CIP2D 22 


8846 


1675


3461 


5247 


7033 


784CIP2D 23 


8912 


1676


3462

5248 


7034 


784CIP2D 24 


8918 


1677 


3463

5249 


7035 


784CIP2D 25 


8918 


1678 


3464


5250 


7036 


784CIP2D 26 


8941 


1679 


3465

5251 


7037 


784CIP2D 27 


8941


1680 


3466


5252 


7038 


784CIP2D 28 


8951 


1681


3467


5253 


7039 


784CIP2D 29 


8951 


1682


3468


5254 


7040 


784CIP2D 30 


9007 


1683


3469 


5255 


7041 


784CIP2D 31 


9012 


1684 


3470


5256 


7042 


784CIP2D 32 


9013 


1685


3471


5257 


7043 


784CIP2D 33 


9025 


1686


3472


5258 


7044 


784CIP2D 34 


9053 


1687


3473 


5259 


7045 


784CIP2D_35


9054 


1688

3474 


5260 


7046 


784CIP2D_36


9054 


1689


3475 


5261 


7047 


784CIP2D_37


9113 


1690


3476 


5262 


7048 


784CIP2D_38 


9134 


1691 


3477 


5263 


7049 


784CIP2D 39 


9152 


1692 


3478 


5264 


7050 


784CIP2D 40 


9152 


1693 


3479 


5265 


7051 


784CIP2D_41


9211 


1694 


3480 


5266 


7052 


784CIP2D 42 


9223 


1695 


3481 


5267 


7053 


784CIP2D_43 


9223 


1696 


3482


5268


7054 


784CIP2D 44 


9231 


1697 


3483 


5269 


7055 


784CIP2D_45


9236 


1698 


3484


5270 


7056 


784CIP2D_46 


9236


1699 


3485 


5271 


7057 


784CIP2D 47 


9303 


1700 


3486 


5272 


7058 


784CIP2D 48 


9309 


1701 


3487 


5273 


7059 


784CIP2D 49 


9314 


1702 


3488 


5274 


7060 


784CIP2D_50


9326 


1703


3489 


5275 


7061 


784CIP2D 51 


9339 


1704


3490


5276 


7062 


784CIP2D 52 


9348


1705 


3491 


5277 


7063 


784CIP2D 53 


9376


1706 


3492 


5278 


7064 


784CIP2D_54 


9382 


1707 


3493 


5279 


7065 


784CIP2D_55


9407 


1708


3494 


5280 


7066 


784CIP2D_56


9414 


1709


3495


5281 


7067 


784CIP2D_57


9439 


1710


3496


5282 


7068 


784CIP2D 58 


9485 


1711 


3497


5283 


7069 


784CIP2D_59 


9493


1712


3498 


5284 


7070 


784CIP2D_60


9501 


1713


3499 


5285 


7071 


784CIP2D_61 


9526 


1714 


3500 


5286 


7072 


784CIP2D_62


9526 


1715


3501


5287 


7073 


784CIP2D 63 


9551 


1716 


3502


5288 


7074 


784CIP2D 64 


9557 


1717 


3503


5289 


7075 


784CIP2D_65


9568 


1718


3504


5290 


7076 


784CIP2D 66 


9588 


1719 


3505


5291 


7077 


784CIP2D_67


9597 


1720 


3506 


5292 


7078 


784CIP2D 68 


9615 


1721 


3507


5293 


7079 


784CIP2D_69 


9628 


1722 


3508


5294 


7080 


784CIP2D_70 


9649 


1723


3509


5295 


7081 


784CIP2D_71 


9652 


1724

3510 


5296 


7082 


784CIP2D_72 


9660 


1725 


3511 


5297 


7083 


784CIP2D_73

9662


1726 


3512 


5298 


7084 


784CIP2D 74 


9725 


1727 


3513 


5299 


7085 


784CIP2D_75 


9746 


1728 


3514 


5300 


7086 


784CIP2D 76 


9777 


1729 


3515 


5301 


7087 


784CIP2D_77


9787


1730 


3516 


5302 


7088 


784CIP2D 78 


9790 


1731 


3517 


5303 


7089 


784CIP2D_79


9842 


1732 


3518 


5304 


7090 


784CIP2D_80 


9842 


1733 


3519


5305 


7091 


784CIP2D 81 


9848 


1734 


3520 


5306 


7092 


784CIP2D_82 


9867 


1735


3521 


5307 


7093 


784CIP2D_83


10010


1736 


3522


5308 


7094 


784CIP2D 84 


10011 


1737 


3523 


5309 


7095 


784CIP2D 85 


10052


1738


3524 


5310 


7096 


784CIP2D 86 


10057 


1739


3525 


5311 


7097 


784CIP2D 87 


10085 


1740

3526 


5312 


7098 


784CIP2D 89 


10139 


1741 


3527 


5313 


7099 


784CIP2D 90 


10142


1742 


3528 


5314 


7100 


784CIP2D 92 


10165 


1743 


3529 


5315 


7101 


784CIP2D 93 


10173 


1744


3530 


5316 


7102 


784CIP2D 94 


10173 


1745 


3531 


5317 


7103 


784CIP2D_95 


10273 


1746 


3532 


5318 


7104 


784CIP2E 1 


3121 


1747


3533


5319 


7105 


784CIP2E 2 


3628 


1748


3534 


5320 


7106 


784CIP2E_4 


3673 


1749 


3535 


5321 


7107 


784CIP2E_5 


4018 


1750 


3536 


5322 


7108 


784CIP2E_6


4467 


1751 


3537 


5323 


7109 


784CIP2E 7 


4865 


1752 


3538


5324 


7110 


784CIP2E_8


4916 


1753 


3539 


5325 


7111 


784CIP2E_9


4923 


1754


3540 


5326 


7112 


784CIP2E_10


4926 


1755 


3541 


5327 


7113 


784CIP2E 11 


4962 


1756 


3542 


5328 


7114 


784CIP2E_12


4963 


1757 


3543 


5329 


7115 


784CIP2E 13 


4964 


1758 


3544 


5330 


7116 


784CIP2E_14 


4988 


1759


3545 


5331 


7117 


784CIP2E_15


5835 


1760 


3546 


5332 


7118


784CIP2E_16 


7682 


1761 


3547 


5333 


7119 


784CIP2E_17


7682 


1762 


3548 


5334 


7120 


784CIP2E 18 


7699 


1763 


3549 


5335 


7121 


784CIP2E_19


7707 


1764 


3550 


5336 


7122 


784CIP2E 20 


7707 


1765 


3551 


5337 


7123 


784CIP2E_21


7752 


1766 


3552 


5338 


7124 


784CIP2E 22 


6357 


1767 


3553 


5339 


7125 


784CIP2E_23 


9065 


1768 


3554 


5340 


7126 


784CIP2E_24


9324 


1769 


3555 


5341 


7127 


784CIP2F 1 


2976 


1770 


3556


5342 


7128 


784CIP2F_2 


3559 


1771 


3557 


5343 


7129 


784CIP2F_3


4021 


1772 


3558 


5344 


7130 


784CIP2F 4 


4474 


1773 


3559 


5345 


7131 


784CIP2F_5 


4566 


1774 


3560 


5346 


7132 


784CIP2F 6 


4705


1775 


3561 


5347 


7133 


784CIP2F 7 


4707 


1776 


3562 


5348 


7134 


784CIP2F_8


4712 


1777


3563 


5349 


7135 


784CIP2F 9 


5008 


1778 


3564 


5350 


7136 


784CIP2F_10


5009


1779 


3565 


5351 


7137 


784CIP2F 11 


5015


1780 


3566 


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353 


7139 


784CIP2F 13 


7724 


1782 


3568 


5354 


7140 


784CIP2F_14 


7725 


1783 


3569 


5355 


7141 


784CIP2F_15 


8828 


1784 


3570 


5356 


7142 


784CIP2F 16 


8830 


1785 


3571 


5357 


7143 


784CIP2F_17


9739 


1786 


3572 


5358 


7144 


784CIP2F_18 


9696 



TABLE 7



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359 


337 


1131 


AHLSARLSALI LDEVAI LPAPQNLS VLSTNMKHLLMWS PVI APG~ 

ETVYYSVEYQGEYBSLYTSHIWIPSSWCSLTEGP3CDVTDDITA 

TVP YNLRVRATLGSQTS / CLEHP /VS I PLIETQ PSLPDL/RME I 

TKDGFHLVIBI.EDIJGPQFEFLVAYWRREPGAEEHVKMVRSGGIP 

VHLETMEPGAAYCVKAQTFVKAIGRYSAFSQTECVEVQGEAIPL 

VLALFAFVGFML I LVWPLFVWKMGRLLQ/ YLLLPRGGSSQTPW 

KITQF 


5360 


2 


1115 


PRVRS SGGQB D P ASQQ WAR PRFTQ PS KMRR R VI AR P VGS S VRLK 
CVASGHPRPDITWMKDDQALrRPEAAEPRKKKWTIiSIjKNLRPED 
SGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTGTHPVNTTVD 
FGGTTS FQCKVRSDVKPVIQWLKRVEYGAEGRHNSTI D VGGQKF 
WLPTGDVWSRPDGS YLN KLLITRARQDDAGMY I CLGANTMGYS 
FRS AFLTVLPDPKPPGP P VASSSSATSLPW P WI GI PAGAVFIL 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL 
AALS AG PG VGLCE EHGS PAA PQH LLG PG P VAG PKL YP KIiYTGHS 
TPHTYTHPPPSCQLNSSHS 


5361 


3 


925 


HEGSISSANI LLDDQFQ P KIrTDFAMAH FRSHLEHQSCT INMTS S 
SSKKLWYMPEEYIRQGKLSIXTDVYSFGIVIMEVLltSCRWLDD 
PKHIQLRDLLRELMEKRGLDSCLSFLDKKVPPCPRNFSAZLFCL 
AGRCAATRAKLRPSMDEVLNTLBSTQASLYFAEDPPTSLKSFRC 
PSPLFLENVPS I PVEDDBSQNNNLLPSDEGLRIDRMTQKTPFEC 
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
liRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362 


2 


4879 


S CQ VEGCTRT YNS S QS IG KHM KTAH P DQ YAA FKMQR KS K KGQ rCA 
NNLKTPNNGKF VYFLPS P VNS SNPFFTSQTKANGNPACSAQLQH 
VSPPIFPAHLASVSTPLLSSMESVINPNITSQDKN3QGGMLCSQ 
MENLPSTAXjPAQMEDLTKTVLPLNIDRGSDPFIiSLPAESSSIDL 
FPSPADSGTNS VTSQLENNTNHYSSQI EGNTNS S FLKGGNGENA 
VFPSQVNVANNFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKKPAI IRDGKFICSRCYRAFTNPRSliGGHLS KRS YCKPLDGA 
EIAQELLQSNGQPSLLASMILSTNAVNLQQPQQSTFNPEACFKD 
PSFLQLLAENRSPAFLPNTFPRSGVTNFNTSVSQEGSE1IIQAL 
ETAG I P S TFEGAEMLSHVSTGC VS DAS QVNATVM PNPTVPPLLH 
TVCHPNTLLTNQWRTSNS KTSS I EECSSLPVFPTNDLLLKTVEN 
GLCSS S FPNSGGPSQNFTSNSSRVSVI SGPQNTRSSHLNKKGNS 
AS KRRKKVAPPLI APNASQNLVTSDLTTHGLI AKSVEI PTTNUI 
SNVIPTCEPQSLVENLTQKLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDS QMMALNS CTTS VNS DLQ I S E DNV I QNFE KT 
LEI I KTAMNSQILEVKSGSQGAGETSQNAQINYNIQLPS VNTVQ 
NNKLPDSS P \ FSS FI SVMPTESNI PQS E \ VSHKEDQ I QEILEGL 
QKLKLENDLSTPASQCVLINTSVTLTPTPVKSTADITVIQPVSE 
M I N IQ FNDKVN KP FVCQNQGCNYS AMT KDAL FKHYG K I HQYTP E 
M I LE I KKNQLKFA P F KC W PT CT KTFTRNSNLRAHCQLVHH FTT 
E EMVKLK I KRP YGRKSQSENV PAS RSTQVKKQLAMTE ENKKESQ 
PALELRAETQNTHSNVAVI PEKQLIEKKS PDKTESSLQV ITVTS 
EQCNTNALTNTQTKGRKIRRHXKEKEEICKRKKPVSQSLEFPTRY 
SPYRPYRCVHQGCFAAFTIQQNLILHYQAVHKSDIiPAFSAEVEE 
ESEAGKES EETETKQTLKE FRCQVSDCSR I FQAITGLIQHYMK1* 
HEMTPEE I ESMTAS VDVGKFPCDQLECKS S FTTYLMYVVHLEAD 
HGIGLRASKTEEDG VYKCDCEGCDR I YATRSNLLRHI FNKHNDK 
HKAHL I RPRRLT PG QENMS S KANQE KS KS KHRGT KHS RCGKEG I 
^PKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKARN 
DALSECTSRF VTQ Y P CM I KGCTS WTSE SN 1 1 RH YKCH KhS KAF 
TSQHRNLUVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTATVSQKEVBKNE*DEMDELTELFITKLINEDSTSVETQA 
NTSSWSNDFQEDNLCQSEl^QKASNLKRVNKEKNVSQNfCKRKVE. 
KAE P AS AAELSS VRKEE ETAVAI QT I EBH PAS FDWS S F KPMGFE 
VSFLKFLEESAVKQKKNTDKDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
VI^KQLQEMKPTVSLKKLBVHSNDPDMSVMXDISIGKATGRGQY 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGAS KSKRQAQQMVQPQS P VAVS 
QS KPGCYDNGKHYQINQQWERTYLGNALVCTCYGGS RGFNCESK 
PEAEBTCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEHT 
CKP IAEKCFDHAAGTS YWGETWEKPYQGWMr^VDCTCLGEGSGR 
2 TCTSRNRCNDQDTRTS YRIGDTWS KKDNRGNL LQC I CTGNG RG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGVVYSVGMQIA*KTGGNKQML\CTCLGNGVSCQETAVTOTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSNGALCHFPFLY2WHNYTDCTSEGRR 
DNMKWCGTTQN YDADQKFGFC PMAAHEE I CTTNEGVMYR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DT FH KRHE EGHMLNCTC FGQGRG RW KCD PVDQ CQDS ETGT F YQ1 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TBTPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTI KGLKPG WYEGCLIS IQQYGHQEVTRFDFTTTSTST 
PVTSNT\ VTGETTPFS PLVATSES VTE1TAS S F WS WVS ASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQ I TGYRLT VGLTRRGQ PRQYNVG PS VSKY 
PLRNLQPASEYTVSLVAI KGNQESPKATGVFTTLQPGSS I PPYN 
TEVTETTIVITWTPAPRIGFKIiGVRPSQGGEAPREJVTSDSGSlV 
VSGLT PG VE YVYT I QVLRDGQERDAP \ I VNK\ WTP LS P PTNLH 
LE AN PDTG VLT VS W E R S TTPD ITGYR I TTT PTNGQQGNS LEE W 
HADQSSCTF \DNIgBVPGLSYNVS VYTVKDDKES VPI SDTI I PAV 
P P PT DLR FTN / 1 LGPDTMR VTW \ AP P PS I DLTNFLVR YS P VKNE 
GRMLQSLS I FFLSDN\AVVLTNLLPGTEYVVSVSSVYEQHESTP 

\lrgrqktglds p\tgi dfs\dita\nsft\vhw\ iapra/tp I 

TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
S IVALNGREES PLLIGQQSTVSDVPRDLEWAATPTSLLI \SWD 
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 
YTITVYAVTGRGDS PAS SKPISI NYRTEI DKPSQMQVTDVQDNS 
lSVKWLPSSSPVTGYRVTTr\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQ P TVE YWS VYAQNPSGESQ PLVQTAVTNIDRPKGLAFTD V 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREVVPRP 
RPGVTEATI TG LE PGTE YTI Y VI ALKNNQ KS E P L IGR KKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQM I FEEHGFRRTTP PTTATP IRHRPRPYPPNVGQE 
ALSQTT I S WAP FQDTS E Y I ISCHP VGTDE E PLQ FR VPGTS TS AT 
LTGLTRGATYNI I VEAiKDO^RHKVREEVVTVGNS VNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGPKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEA7C YDDG KT YHVGEQWQ KE YLGAI CS CTC FGGQRGWRCDNC R 
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGAS KSKRQAQQMVQPQS PVAVS 
qskpgcydngkhyqinqqwertylg^ALVctcWgsrgfncesk 
peaeettcfdkytgintrvgdtyerpxe)smiwdctcigagrgris 

CTIANRCHEGGQSYKIGDTWRJIPHETGGYMLECVCLGNGKGEVfT 
CKP I AEKC FDHAAGTS YWGETWE KP YQGWMMVDCTCLG EGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNIjLQCICTGNGRG 

ewkc erhts vqtts sgsg p ftd vraavyqpq ph pqp p p yghcvt 
dsgvvysvgmqia*ktqgnkqml\ctclgngvscqetavtqtyg 
gnsngepcvlpptyngrtfyscttegrqdghlwcsttsnyeqdq 
kys fctdhtvl vqtrggnsngalch f pflynnhnytdctsegrr 

DNMKWCGTTQNYDADQKFGFCPMAAK2EICTTNEGVMYRIGDQW 

dkqhdmghmmrctcvgngrgewtciaysqlrdqcivdditynvn 
dtfhkrheeohmlnctcfgqgrgrwkcdpvdqcqdsetgtfyqi 
gdswekyvhgvryqcycygrgigewhcqplqtypsssgpvevfi 
tetpsqpnshpiqwnapqpshiskyilrwrpknsvgrwkeatip 
ghlnsyti kglxpg wyegqlis iqqyghqevtrfdftttstst 
pvtsnt\vtgettpfs plvatses vteitass fwswvsasdtv 
SG FR VEYELS eegde pqylvlpstats v\ni p\ dllpgrkyi vn 

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV
HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTIIPAV
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE
GRMLQSLSIFFLSDN\AWLTNLLPGTSYWSVSSVYEQHESTP
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS
SVWSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP

QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGPRRTTPPTTATPIRHRPRPYPPNVGQE
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT
LTGLTRGATYNIIVEALKDQQRHKVREEWTVGNSVNEGLNQPT
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ
ADREDSRE 




8041 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDGTCLGEGSGR
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 

EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT

DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCCETAVTQTYG

GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 

KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR

DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 

DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 

GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP

GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST

PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR

PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 

ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV

KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN

TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN

ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 

PLRNLQPASEYTVSLVAIKGMQESPKATGVFTTLQPGSSIPPYN

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV

VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW

HADQSSCTF\DNLEVPGLEYMVSVYTVKDDKESVPISDTIIPAV

PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE

GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP

\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 

TGYRIR\HHPEHF\SGRPREDR\VFHSRNSITLTNLTPGTEYW

SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD

APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 

YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS

ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV

DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 

ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINIAPDSS

SVWSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR

ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP

DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 

NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 

QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT

SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE

ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT

LTGLTRGATYNIIVEALKDQQRHKVREEWTVGNSVWEGLNQPT 

DDSCFDPYTVSKYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

SSRWCHDNGVKYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 

HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ

ADREDSRE


5366


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP

PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML

RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS

QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 

PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS

CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 

CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 

ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGKLLQCICTGNGRG

EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT

DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG

GNSMGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR

DNMKWCGTTQNYDADQKPGFCPMAAHEEICTTNEGVMYRIGDQW

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 

DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 

GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST

PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR

PQAPITGYRIVYSPSVEGSSTELNLPSTANSVTLSDLQPGVQYN

ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 

KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY
PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW
HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTIIPAV
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE
GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP

\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW
SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 
YTITVYAVTGRGDSPASSKPISINYRTEIDKPSQMQVTDVQDNS

ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGLAFTDV
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 

ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS

NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP

RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT
LTGLTRGATYNIIVEALKDQQRHKVREEWTVGNSVNEGLNQPT
DDSCFDPYTVSKYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR
RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ
ADREDSRE


5367


235 


3591 


kkilnmlckknivieyladilybylygfcfsgikkyliihvlrl " 
I LELWM TRI/LLE KS VS LQTQYLJjIjI vki ls wfpg kemrhhlqim 
evmmrkqds/rivongseqqlqkeladvlmdppmddqpgekelv 
krsqldgegdgplsnqlsasstinpvplvglokpemslpvkpgq 

GDS E AS S P FT P VADEDS WFS KIiTYTjGCAS VN APR S E VE A LRMM 
SILRSQCQISLDVTLSVPNVSEGIVRLLDPQTNTEIANYPIYKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
LYSFATAFRRSAKQTPLSATAAPQTPDSDIFTFSVSLEIKEDDG
KGYFSAVPKDKDRQCFKLRQGIDKKIVIYVQQTTNKELAIERCF
GLLLSPGKDVRNSDMHLLDLESMGKSSDGKSYVITGSWNPKSPH 
FQWNEETPKDKVLFMTTAVDLVITEVQEPVRFLLETKVRVCSP
NERLFWPFSKRSTTENFFLKLKQIKQRERKNNTDTLYEWCLBS 
ESERERRKTTASPSVRLPQSGSQSSVIPSPPEDDEEEDNDEPLL 
S GSGDVS KECAEK I LETWGELLS KWHLNLNVR PKQLS S L VRNG V 
P EALRGE VWQLLAGCHNNDHL VEKYRI I* I TKE S PQDS A I TRD IN 
RTFPAHDYFiCDTGGDGQDSLYKICKAYSVYDEEIGYCQGQSFIiA 
/\v uuuinr c cy/\r a v Li v KIMr DYGLR£IjFKQNFEDLHCKFYQLE 
RLMQEYIPDLYNHFLDISLEAHMYASQWFLTLFTAKFPLYMVFH 
IIDLLLCEGIS\aFm^ALGLLKTSKDDLLJ.TDFEGALKFFRVQL 
PKRYRSEENAKFCLMELACNMKISQKKLKKYEKEYHTMREQQAQQ 
EDPIERFEREN5RLQEANMRLEQENDDLAHELVTSKIALRKDLD 
NAEEKADALNK3LLMTKQKLIDAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKGISSTKEVLDEDTDEEKETLKNQL 
REMELELAQTKL\QLVEASCKIQD\LEHPF*GIjPFNE\VQAA\K 
KTWFNRTLSSIKTATGVQGKETC 


5368 


573 


2014 


GAAAGAADPRRGSLGGRTMLDFAi FAVT FLLALVGAVL Y L Y PAS 
RQAAG I PG I TP T EEKDGNLPD 1 VNS G S LHE FLVNLH ERYG PWS 
FWFGRRLWSLGTVDVLKQHINP^TLD/LF»NHAEVIIICVSIW 
WWQCE * KP \ Q R KKL YBNG VTDS LKSN FALLLKLP EE LLD K W LS Y 

petqh\vplsqhmlgfamksvtqmvmgstfeddqevirfqknhg 

TVWS E IGKG FLDGSLDKNMTRKKQ Y EDALMQLES VLRN I IKERK 
GRNFSQHIFIDSLVQGNIjNDQQILEDSMIFSIjASCIITAKLCTW 
AIWFLTTSEEVQKFCLYEEINQVFGNGPVTPEKIEQLRYCQHVLC 
ETVRTAKLTPVSAQLQDIEGKIDRFIIPRETLVLYALGWLQDP 
NTWPSPHKFDPDRFDDELVMKTFSSLGF5GTQECPELRFAYMVT 
T VLLS VLVKRLHLLS VEGQVI ETKYELVTS S R E EAW I TVS KRY 


5369 


1 


6622 


PRSLCFSLWAEAAVIaADGGLRRRRRLLRGTMSASFVPNGASLED 

CHCNL FCLADLTG I KWKKYVWQG PTS AP 1 LF P VTEED P I LS S FS 

R CL KADVI/3/ VWRRDQR PERRE \ L * I FWGGEDP \ VLLTL FTMT Y 

QKKKMECGRMDFPMNAVLCFSKAVRNLLERCLMNRNFVRIGKWF 

VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 

LLSEEHITLAQQSNSPFQVILCPFGLNGTLTGQAFKMSDSATKK 

LIGEWKQFYPISCCLKEMSEEKQEDMDWEDDSLAAVEVLVAGVR 

M I Y PACFVL VPQS DI PTPS P VGSTH CS SSC W3VHQ VP ASTR DP A 

MSSVTLTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 

GGKI PRKLANHVVDRVWQECNMNRAQNKRKYSASSGGLCEEATA 

AKVASWDFVEATQRTNCSCLRHKNLKSRNAGQQGQAPSLGQQQQ 

ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 

SQRLV\ISAP\DSQ\VRFSNIR\TNDVAK\TPQMHGTEMANSPQ 

PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 

EDRlDSIiSQSFPPQyQEAVEPTVYVGTAVNLEEDEANIAWKYYK 

FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 

PLKVS DELVQQ YQ I KNQ CLS A IASDAEQEPK IDP YAFVEGDE E F 

LFPDKKDRQNSEREAGKKHKVEDGTSSVTVLSHEEDAMSLFSPS 

IKQDAPRPTSKARPPSTSLIYDSDLAVSYTDLDNLFNSDEDELT 

PGS KRS ANGSDDKASCKE S KTGNLDPLSCI STADLHKMYPTP PS 

LEQKIMGFSPMNMNNKEYGSMDTTPGGTVLEGNSSSIGAQFKIE 

VDEGFCSPKPSEIKDFSYVYKPENCQILVGCSMFAPLKTLPSQY 

LPLIKLPEECIYRQSWTVGKLELLSSGPSMPFIKEGDGSNMDQE 

YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 

RTPRTPRGAGGPASAQGSVKYENSDLYSPASTPSTCRPLNSVEP 

ATVPS I PEAHSLYVNLILSESVMNLFKDCNSDSCCICVCNMNI K 

GADVGV Y I PDPTQEAQYRCTCG FSAVMNRKFGNNSGLFFEDELD 

I IGRNTDCGKEAEKRFEALRATSAEHVNGGLKES EKLSDDL ILL 

LQDQCTNLFSrFGAADQDP FPKSGVI SNWVRVEERDCCNDC YLA 

LEHGRQFMDNMS GG KVDEALV KS S CLH P WS KRND VS MQCSQD I L 

R MLLS LQ PV LQDAI QKKRT VR P WGVQG P LTWQQ FH KMAGRG S YG 

TDESPEPLPIPTFLLGYDYDYLVLSPFALPYWERLMLEPYGSQR 

DIAYWLCPENEALLNGAKS KFRDLTAI YESCRLGQHRPVSRLL 

TDG I MR VGSTAS KKLSEKLVAE WFSQAADGNNEAFS KLKLYAQ V 

CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 

NTPSATLASAASSTMTVTSGVAISTSVATANSTLTTASTSSSSS 

SNLNSGVSSNKLPSFPPPGSMNSNAAGSMSTQANTVQSGQLGGQ 

QTS ALQTAGI SG BSSS LPTQPHPD VS ES TMDRDKVG I PTDGDSH 

AVTYPPAIWYIIDPFTYENTDESTNSSSVWTLGLLRCFLEMVQ 
TL P PH I KSTVS VQ 1 1 P CQ Y LLQ P V KH BD RE I YPQHLKS LAFSAF" " 

TQCRRPLPTSTNVKTLTGFX3PGLAt1ETALRSPDRPECIRLYAPP 

FI LA P VKD KQTE LGETFGEAGQKYNVLFVG YCLS HDQ R WI LASC 

TDLYGBLLETCIINIDVPNRARRKKSSARKFGLQKLWBWCLGLV 

QMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLQSLSKRLKDM 

CRMCGI SAADS PS ILS ACLVAME PQGS FVIMPDS VSTGS VFGRS 

TTIiNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 

AFNPNNDGADGMGIFDLLDTGDDLDPDIINILPASPTGSPVHSP 

GSHYPHGGDAGKGQSTDRLLSTEPHEEVPNILQQPLALGYFVST 

AKAGPLPDWFWSACPQAQYQCPLFLKASLHLHVPSVQSDELLHS 

KHSHPLDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 

FWLNQLYNFIMNML 


5370 


1226 


716 


RWSRKLELRRAAQATESRPPQSQEMHPPTGKEVHALKRLRDSAN 
AKDVETV<^LLEDGADPCAADDKGRTALHFASCNGNDQIVQLLL 
DHGADPNQRDGLGNTPLHliAACTNHVPVITTLLRGGARVDALDR 
AGRTPLHLAKSKLNILQEGHAQCLKAVR/HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSLALAESLSLFRACTSLPVG 
GCISWL 


5371 


1331 


167 


IAAMLWKLLLRSQSCRLCSFRKWRSPPKYRPFIACFTYTTDKCS
SKENTRTVEKLYKCSVDIRKIRR\*KDGYF*RMKPMLKKLRI/P 
LQELGADETAVASILERCP EAIVCS PTAVNTQR KLWQLVCKNEE 
EL I KLIEQF PES FFTIKDQSNQKLNVQFFQELGLKNWI SRLLT 
AAPNV FHNP VE KNKQM VR I LQE S Y LDVGGS EANM KVWLL KLLSQ 
NPFILLNSPTAIKETLEFLQEOGFTSFEILQLLSKLKGFLFQLC 
PRSIQNSISFSKNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM 
QGLLREGI S I AQI RETPMVLELTPQI VQYRI RKLNSSGYR I KDG 
HLANLNGSKKEPEANFGKIQAKKVRPLFNPVAPLNVEE 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME
PLRLLILLFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR
RKAWCRQLGEKGPCQRWSTHNLWLLSFLRRWNGSTAITDDTLG
GTLTITLRNLQPHDAGLYQCQSLHGSEADTLRKVLVEVLADPLD
HRDAGDLWFPG\DLRASRM?MWSTASPGASWKEKSPSHPLPSFS 
S W PAS FS SRF* Q PAPSGLQ PGMDRS QGH I HPVNWTVAMTQG I SS 
KLCQG 


5373 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP 
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC
ANFKKEAIFYCCWNTSYCDYPCQ\QAHMPEH\MKSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA

EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPPLGTSK


5374


2814


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHPSASE
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPSKDFSGKAKPS
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDSTKKKQWC 
ANFKKEAIPYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375


2907 


1116 


HIFLAEEEPMLERRCRGPliAMGPAQPRLLSGPSQESPQTliGKES 
RGLRQQGTS VA \ QSGAQAPGRAH RCAHCRRH F PGWVA\ LWLHTR 
RCQA/RGLPLPCPECGRRFRHAP PLALHRQVHAAATPDWG FACH 
LCGQS FRGWVALVLHLRAHSAAKAG P?ACP KMARDA FWR R KAAS 
b fa l LiKRCH PS R P KG PR P F I CGNCGRS I LPTWDQ / LKVAHKRVHV 
S RR P * ERG P P AKVF WG PR P RG P PTGDT PPG PGGD AVDRP F \QCA 
COGKR FRH K\ PNIi I RSHAACTSGER PHQ /CS R ECG \ KRFTNKP Y 
LTS\HRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSE5S 
AQAAPGPGS PQLPAGPQES AAE PTPAVPLKPAQE PPPGAP PEHP 
QDP I EAP PS L YS CDDCGRS FRLERFLRAHQRQHTGERP FTCAEC 
GKNFGKiCTHLVAHSRVHSGERPFRLARKCGRRFLPRASQSGGKN 
SAEPNAPRFGPFVCPDCGKAFRHKPYLAAHRPIATPAEKPYVCP 
DCRKAFSQKSNL\VSHRRIHTGERPYACPDCDRSFSQKSNIilTH 
RKSHIRDGAFCCAICGQTFDDEERLIiAHQKKHDV 


5376 


4504 


591 


VST FS LCLWPAGGGGRGR VSNMAQS KRHV YSRTPSGSRMSAEAS 
AR P IjR VGS R VE V 1 G KG HRGT VA Y VG ATLFATGKW VG VI LD E AfCG 
KNDGTVQGRK YFTCDEGHG I FVRQS Q IQ VFEDGADTTS P E TPDS 
S AS KV LKREGTDTT A KTS KLRGLKP KXAPTAR KTTTRRP KPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPI I PTP 
VLTS PGAVP PUPS PSKEEEGLRAQ VRDLEE KLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKMQEQQADIiQRRLKEARKEAKEAL 
EAKERYMEEMADTADAIEMATLDKEMAEERAESLQQEVEALKER 
VDELTTDLEILKAEIEEKGSDGAASSYQLKQLEEQNARLKDALV 
RMRDLSSSEKQEHVK\LQIO J MEKKNQELEVVRQQRERLQEELSQ 
AES TI DELKEQVDAALGAEEMVEMLTDRNLNLEEKVRELRETVG 
DLEAMNEMNDBLQENARETELELREQLDMAGARVREAQKRVEAA 
QETVAD YQQT 1 KKYRQLTAHbQDVNRELTNQQEAS VERQQQPPP 
ETFDFKI KFAETKAHAKAIEMELRQME VAQANRHMS LLTAFMPD 
SFLRPGGDHDCVLVLLLMPRLICKAELIRKQAQEKFELSENCSE 
RPGLRGAAGEQLSFAAIGLVY\SLMPAAGHRYHRY*CHALSQCR 
LD\VYKKVGSLYPEMSAHBRSLDFLIELLHKDQLDETVNVEPLT 
KAIKYYQHLYS IHLAEQPEDCTMQLADHI KFTQSALDCMS VEVG 
RLRAFLQGGQEATDIALLLRDLETSCS \ DIRQFCKKIRRRM PGT 
DAPG1 PAALAPGPQVSDTLIiDCRKHLTWVVAVIiOEVAAAAAQLI 

IS1TWK\LVTA>JQEGEYDAERPPSKPPP\VELRAAALRAEITDA 
EGLGLKLEDRBTVIKBIiKKSLKIKGEEliSEANVRLTLLEKKLDS 
AAKDADE R I E KVOTR IiE ETO AI . I ,T3 K KP KV PP FTM n A T .DA n T nnr. 

EAEKAELKQRLNSQSKRTIEGLRGPPPSGIATLVSGIAGEEQQR 
G AI PGQAPGS VPGPGLVKDS P LLLQQ I S AMRLH I SQ LQHENS I L 
KGAQMKASLASLPPLHVAKLSHEGPGSELPAGALYRKTSQLIjET 
LNQLSTHTHWDITRTSPAAKSPSAQLMEQVAQLKSLSDTVEKL 
KDEVLKETVSQRPGATVPTDFATFPSSAFLRAKBEQQDDTVYMG 
KVTFSCAAGFGORHRLVLTQEQLHQLHSRDIS 


5377 


762 


1106 


DVPCKRVL PAEAQE KGQLTLS CGESGEEG \F* YHEVRQAEG ES * 
/ WFG PMVRLVHTQLKTKK PSG TLKAKFYLHTGSTKFAARIS CT X 
SS*WPG YDGWWGGQYI FI FRGMRWEBQP 


5378 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNVVTGEMV
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5379


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGNWTGEMV
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5380 


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME
SPIWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEXGKGSYGWKLA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV

F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIWPEI
KLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKHIPSLATV
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL
PDLVGAPGSHFCFLNIALLRYNSHTM 


5381


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME
SFIWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP
RLPRRPTVESHRVSITGMQDCVQLNQYTLKDEIGKGSYGVVKLA
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIWPEI
KLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKHIPSLATV
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL
PDLVGAPGSHFCFLNIALLRYNSHTM


5382


1536 


203 


GARGSQQDAPALQEAEVRGPERAQPARGRMTKARLFRLWLVLGS
VFMILLIIVYWDSAGAAHFYLHTSFSRPHTGPPLPrPGPDRDRE 
LTADSDVDEFLDKFLSAGVXQSDLPRKETEQPPAPGSMBESVRG 
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD 
DIPNSELSHLIVDDRHGAIYCTVPKVACTNWKRVMIVLSGSLLH 
RGA P Y RDP LR I P REHVHN AS AHLTFNK FWRRYG KLS RHLM KVKL 
KKYTKFLFVRDPFVRH SAFRSKFELENEEF/ * PQVRRAHAAAV 
RQPHQ PARLGARGLPRWPQ\ VSFANF I QYLLDPHT3KLAPFNEH 
WRQVYRLCHPCQIDYDFVGKLETLDEDAAQLLQLLQVDLAAPLP 
PEbPGTGPPSSWEEDWFAKlPLAWRQQLYKLYEADFVLFGyPKP 
ENLLRD 


5383 


45 



5250 


VERLLGCRNS KRTWRMLISKNMPWRRLQG I SFGMYSAEELKKLS 
VKS I TNP RYLD5 IXJNP SANG IjYDLALGP ADS KE VCS TCVQD FS N 
CSGHI^HIELPLTVYNPLLFDKLYIiLLRGSCl^CHMLTCPRAVI 
HLLLCQLRVliE VGALQAVYELERILSRFLEENADPSASE I REEL 
EQYTTEIVQNNLLGSQGAHVKNVCESKSKLIALFWKAHMNAKRC 
PHCKTGRSWRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 
IG KRG YLTPTS AREHLS AIjWKNEGFFIjNYL FS GMDDDGMES RFN 
PSVFFIjDFLWPPSPJSRPVSRLGDQMFTNGQTVNLQAWIKDVVL 
IRKLLAIiMAQEQKLPEEVATPTTDEEKDSLIAIDRSPLSTLPGQ 

slidklyijiwirlqshvnxvfdsemdklmmdkypgirqilekke 
glfrkhmmgkrvdyaarsvicpdmyintneigipmvfatkltyp 
qpvtpwnvqelrqavingpnvhpgasmvinedgsrtalsavdmt 
qre avakqlltpatgap kpqgt ki vcrhvkngd i lllnrqptlh 
rpsiqahraribpeekvlrlhyanckaynadfdgdemnahfpqs 

ELGRAEAYVIiACTDQQYLVPKDGQ PLAGLI QDHM VSGASMTTRG 

CFFTREHYMELVYRGLTDKVGRVKLLS PS ILKPFPLWTGKQWS 

TLLINIIPEDHIPLNLSGKAKITGKAWVKBTPRSVPGFNPDSMC 

ESQVIIREGELLCGVLDKAHYGSSAYGLVKCCYEIYGGETSGKV 

LTCLARLFTAYLQLYRGFTU3VEDILVKPKADVKRQRIIEESTH 

CGPQAVRAALNLPEAAS YD3VRGKWQDAHLGKDQRDFNMI DLKF 

KEEVNHYSNEINKACMPFGIjHRQFPENTLQLMVQSGAKGSTVIJT 

MQISCI.ICQIELEGRSTPLMASGKSLPCFEPYEFTPRAGGFVTG 

RFLTGIKPPEFFFHCMAGREGLVDTAVKTSRSGYLQRCI1KHLE 

GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 

SNYBVIMKSQHLHEVLSRADPKKALHHFRAIKKWQSKHPNTLLR 

RGAFLS YSQKIQEAVKALKLESENRNGR/RPWDS /G/RMLRMWY 

ELDEESRRKYQKKAAACPDPSLSWRPDIYFASVSETFETKVDD 

YSQEWAAQTEKSYEKSELSLDRLRTLLQL\KWQRSLCEPGEAVG 

LLAAQSIGEPSTQMTLOTFHFAGRGEMNVTliGIPRLREILMVAS 

ANIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 

ESFCMEEKQNKFQVYQLRFOFLPKAYYQQEKCLRPEDILRFMET 

RFFKLLMESIKKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 

EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 

EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 

PSRPPDAAPETHPQPGAPGA\EAMERRVQAVRE i hp fiddyqyd 

T EES LW CQ VTVKL PLMK I NFDMSSLWS LAHGAV I YAT KG I TRC 

IiLNETTNNKNEKELVLNTEGINLPELFKYAEVLDLRRLYSNDIH 

AIANTYGIEAALRVIEKEIKDVFAVYGIAVDPRHLSliVADYMCF 

EGVYKPLNRFGI RSNS SPLQQMTFETS FQFLKQATMLGSHDELR 

S PSACLWGKWRGGTGLFELKQPLR 


5384 


196 


886 


QSCGQRLPTVL*L*GPPGSCPCILSLF\PGRPHALPEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTALKSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRGIYFFSLNVHSWNYKETYVHIMHNQKEAVILYAQPS

ERSIMQSQSVMLDLAYGDRVWVRLFKRQRENAIYSNDFDTYITF 
SGHLIKAEDD 


5385 


326


799 



LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVKSHKKNKIHM

SPTFRRPKTL*LRRQPKYPWKSTPRRKKLDHHVIIKFPLTTE*A
VKKIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ
SDGERKAYVRLAPDYDALWATKIGIT 


5386 


326 


799


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A
VKKIENNSLLVFTVDVKANKHQIKQAVKK/LCDIDVAKVNTLIQ
SDGERKAYVRLAPDYDALWATKIGIT 


5387


2 


2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWAIA 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY 
LIVLMILLAWICTVSAIMCVSMRGTICNPGPRKSMSKLLYIRL 
ALFFPEM VWAS LGAA WVADG VQCDRTWNG 1 1 ATVWS W 1 1 1 AA 
TWS II I VFDPLGGKMAP YSSAG PSHLDS HDSSQhLHGLKTAAT 
SVWETRIKLLCCCIGKDDHTRVAFSSTAELPSTYFSDTDLVPSD 
IAAGLALLHQQQDNIRNNQ3PAQWCHAPGSSQEADLDAELKNC 
HHYMQFAAAAYGWPLY I YRNPLTGLCR IGGDCCRS KN PQTMT /M 
VGGDQLQIj/CTSAPILHTHRAAVQGLHPRQLPWTRFTELPFLVA 
LDHRKES VWAVRGTMS LQD VLTDLSAES E VLD VECE VQDRLAH 
KG I SQ AAR YVYQRL INDG I LSQ AFS IAPE YRL V I VGHS LGGG AA 
ALLATMVRAAYPQVRCYAFS PPRGLWS KALQE YSQS F I VSLVLG 
KDVIPRLSVTNLEDLKRRILRWAHCNKPKYKILLHGLWYELFG 
GN PNNLPTELDGGDQEVLTQPLLGEQS LLTRWS PAYS FSS DS PL 
DSSPKYPPLYPPGRIIHLQEEGASGRFGCCSAAHYSAKWSHEAE 

FS K I L I GPKMLTDHMPDI LMRALDS WSDRAACVS CPAQGVS SV 
DVA 


5388 


1569 


753 


TAJDGGAGGGGKRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF

TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5389 


1569 


753 


TADCXSAGGGGRRQAGVRRiiyLVPFTGGYRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF
TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK
ILEHLQTKN 


5390 


217 


1332 


E D P R KLM E DKM WS ECEG P EMS LtVCLTD FQ AHAR EQLS KSTPL5fI 
EGGADDSITRDDNIAAFKRIRLRPRYLRDVSEVDTRTTIQGEEI 
SAP I CI APTGFHCLVWPDGEMSTARAAQAA\G I CYITSTFAS CS 
LEDIVIAAPEGLRWFQLYVHPDLQLNKQLIQRVESLGFKALVIT 
LDTPVCGNRRHDI RNQJjRRNLTLTDLQS PKKGNAI P YFQMTP I S 
TSLCV/NDLSWFQSITRIiPIILKGILTKEDAELAVKHMVQGIIVS 
NHGGRQLDEVLAS IDALTEWAAVKGKI EVYLDGGVRTGNDVLK 
ALALG AKC I FLGDA I L W ALAS KGEHG VKE VLN I LTNE FHTS MA\ 
LTGCRSVAEINRNLVQFSRL


5391


1 


1292 


VKKAAGRSRGPPTAGGQRCEEAPGTV>5ERRLGVRAWVKENRGS F ' 
QPPVCNKLMHQEQLKVMFVGGPNTRKDYHIEEGEEVFYQLEGDM 
VLRVLEQGKHRD W I RQG E I FLLPAR V PHSPQR FANTVGLWE R 
RRLETELDGLR YY VGDTMD VL FEKW F YC KDDGTQLAP 1 1 QE FFS 
SEQYRTGKPIPDQI/LKEPPFPLSTRSIMEPMSIiDAWIiDSHHREIi 
QAGTPLS LFGDTY ETQVIAYGQGS SEGLRQNVD VWLWQLEG S S V 
VTMGGRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 
D PACKKS P WGEPS CHGLKAATG VPSTLE VP SLPNNS P S PH YLS V 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQrQPTAL 
PVLPG GLPPAPLLP I PLSLQTQ CSTS TPRR PS IKAS 


5392


1


1623 


i rgsnaqkwgas gsggag pqpdpagpggvpalaaavlgace pr 
caapcplpalsrcrgagsrgsrggrgaagsgdaaaaaew i rkgs 
fihkpahgwlhpdarvlgpgvsyvvrymgcrevlrsmrsldfrrr 
rtqvtreainrlheavpgvrgswkjckapnkalasvlgksnlrfa 
gmsisihi stdglslsvpatrq vi anhhm ps i sfasgg dtdmtd 

YVAYVAKDPINQRACHIIiECCEGL\AQS I ISTVGQAFELRFKQY 

lhsppkvalpperlagpeesawgdeedslehnyyns 1 pgkeppl 
gglvdsrlaltq pcaltaldqg ps psl rdacslp wdvgs tgtap 
pgdgyvqadargppdheehlywtqgldapepedspkkdlfdmr 
pfedalklhecsvaagvtaaplpledqwpspptrrapvapteeq 
lrqe pwyhgrmsrraae rmlradgdflvrds vtnpg q yvltg mh 
agqpkklllvdpegvvrtkdvlfesishlidhhlqngqpivaae 

SELHLRGWSREP 


5393 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine,
K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
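The symbol legend in the column header above can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how one segment from this table might be tallied against that legend. The helper name `summarize_segment` and its return layout are assumptions made for illustration only. Characters that the legend does not define (digits, lowercase letters, punctuation) are reported rather than corrected.

```python
# One-letter residue codes exactly as listed in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue symbols defined by the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Hypothetical helper: tally legend symbols in one amino acid segment."""
    # Whitespace inside a segment carries no meaning, so drop it first.
    compact = "".join(segment.split())
    residue_count = sum(1 for ch in compact if ch in RESIDUES)
    markers = {name: compact.count(sym) for sym, name in MARKERS.items()}
    # Anything outside the legend is flagged, never silently repaired.
    unrecognized = sorted({ch for ch in compact
                           if ch not in RESIDUES and ch not in MARKERS})
    return {
        "residue_count": residue_count,
        "markers": markers,
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    # A short segment line taken from this table.
    print(summarize_segment("RAELIRAFGERE"))
```

A report of this kind only flags symbols outside the legend. Deciding what the intended residue was still requires the original sequence listing.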








\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSIi\SDPVHLTV 
LSEWLVIiO/TPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFS I PQANHSHSGDYHCTGN IGYTLFSSKPVTITV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN 

RAPTDDDKNIYliTLPPNDHVNSNN 


5394 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLIO^PPWlNVIX)\EDSVTLTCQGAPQP/ERSDSIQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMIiRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFSIPC^NHSHSGDYHCTGNIGYTLFSSKPVTITV 
Q V PS MGSS S P MG 1 1 VAW I ATAV AA I VAA WAL I YCR KKR I SAN 
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNI YLTLPPNDHVNSNN 


5395 


3135 


531 


RAS DAKNQEGLLNTRRKSTDS VP I S KSTLS RSIiSIiQASDFDGAS 
S SGN PEAVAIAPDAYSTGSS SASSTLKRTKKPRP PSLKKKQTTK 
KPTETPPVKBTQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEET PLE P AAG PKAACPLDS ES VEG WP P ASGGGRVQNS PP VG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKXPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDIPIAKGTYTFDIDKWDDPMFNPFSSTSKMQBSPKL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETP P V IS AWHATD E EKIiAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
S EA I E I TAPEG S FASADAIiLS RLAHPVSLCGALD YIjEPDLAEKN 
PPLFAQ KLQREAAH PTDVS I S KTAL Y SRI GTAE VE KPAG LL FQQ 
PDLDSALQIARAEIITKERET'/SEWKDICYEESRREVMEMRKIVAE 
YEKTI AQMIEDEQREKS VS\HQTVQQLVIiEKEQA\ LiADLNS VE K 
\ SLADLFRRYE KMKBVLEGFR KNEE VliKRCAQE YLSR VKKEEQR 
YQALKVHA\EE kldranab\ i aqvrgkaqqeqaahqaslaers s 

CRV\DALERTLEQKNKEIEELTKICDELIAKKGKS 


5396 


3135 


531 


RASDAKNQEGLLNTRRKSTDS vpis KSTI»SRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSES VEG WP PASGGGRVQNS PPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPAS? 
PRS PAE PND I P I AXGTYT FD I D KWDD PN FN PFS S TS KMQES P KL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQD? 
TPAAT PET PPV IS AWHATD EEKLAVTNQKWTCMTVD LEAD KQD 

YPOPQFVF«QT , J7Y7WP'PV'J?Q CDTCITT nvSMOVE*yDVMr>VTric(«T rirtrs. 
iruroL/XJOir vtttiM'oorl D£bUiAf«d X Ctlc,XMttl\LGSShPQu 

DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 

ALVNTAAKNQHPVPRGLAPNQESHIiQVPEKSSQKELEAMGLGTP 

SEAIBITAPEGSFASADALLSRLAHPVSLCGALDYLEPDIiAEKN 

PPLFAQKLQREAAHPTDVSI S XTALYSR I GTAEVEKPAGLLFQQ 

PDLDS ALQ IARAE I ITKERBVS EWKDKYEESRREVMEMRKI VAE 

YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK 

\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR 

YQALKVHA\EEKLDRANAE\ I AQVRGKAQQEQ AAHQASLAERSS 

CRV\DAIiE RTL EQ KNKE I EE LTKI CDEI* I AKMG KS 


5397 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS

SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 

KPTETPPVKETQQEPDEESLVPSGBNLASETKTESAKTEGPSPA 

LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 

RKTLPLTTAPEAGE VTPS DSGGQE DS PAKGHS VR LE FDYS ED KS 

SWDNQQENPPPTKKlGKKPVAKMPIiRRPKMKlCTPEKLDNTPASP 

PR SPAE PND I P IAKGTYTFD 1 DKWDDPNFNP FSSTS KMQESP KL 
PQQS YNFDPDTCDES VDP FKTS S KTPS S PSKSPAS FE I PAS AME 
ANGVDGDGIJyJKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
T P AAT P ET P PV I S AWHATD E E KLAVTNQ KWTCMT VD LEAD XQD 
ityruuujir vim i tvt Mr I oauuzKuolaLlLintiKlGSSIjPQD 

DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAI E I TAPEGS FASADALLS RLAHPVSLCGALD YLEPDLABXN 
PPLFAQKLQREAAHPTDVS I S KTALYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE 

yektiaqmiedeqreksvsxhqtvqqlvlekeoaXladlnsvek 
\ s ladlfrrye kmkevleg frkneev lkrcaqe yls rvkkeeqr 
yqalkvha\bek1.dranae \ iaqvrgkaqqeqaahoastiaerss 
crv\dalertleqknkeieeltkicdeliakmgks 


5398 


56 


5426 


SGEVCRMESNFNQEGVPRPSYVFSADPIARPSEINFDGIKLDLS 
HE FSLVAPNTEANS FES KDYLQVCLRI RPFTQS EKELESEGCVH 
ILDSOTWLKEPQCIIiGRLSEKSSG\QM\AQKFSFFPGFLGPAT 
TQKEFFQGClMHP\VKDLLKGQSRIiIFTYGLTNSGKTYTFQGTE 
ENIRILPRTLNVLFDSLQERLYTKMNLKPHRSRBYLRLSSEQEK 
EEIASKSALLROIKEVTVHNDSDDTL.YGSLTNSLNISEFEESIK 
DYEQANLNMANS I KFS VWVS FFE I YNE YI YDLFVPVSS KFQKRK 
MLRLSQDVKG YS FI KDLQWIQVSDS KBAYRLLKLG I KHQSVAFT 
KLNNASSRSHS I PTVKI LQIEDSEMSR VI RVSELSLCDIAGSER 
TMKTQNEGERLRETGNI NTSLLTLGKCINVLKNS EKS KFQQH VP 
FRESKLTHYF/QSFFNGKGKICMIVNISQCYLAYDETLNVLKFS 
AIAQKVCVPDTLNSSQEKLFGPVKSSQDVSLDSNSNSKILNVKR 
ATISWENSLEDLMEDEDLVEELENAEETED/VGETKLLDEDLDK 
TLEENKAF I SHEEKR KLLDLI EDLKKKLINEKKEKLTLEFK I RE 
EVTQEFTOYWAQREADFKETLLQEREI LEENAERRLAI FKDLVG 
KCDTREEAAKDICATKVETEBATACLELKFNQIKAELAKTKGEL 
I KTKEELKKRENESDSL IQELETSNKKI I TQNQRI KEL INI I DQ 
KEDTINE FQNLKSHMENTFKCNDKADTSSLI INNKLI CNETVEV 
PKDSKSKICS3RKRVNENELQQDEPPAKKGSIHVSSAITEDQKK 
SEEVRPNIAEIEDIRVLQENNEGLRAFLLTIENELKNEKEEKAE 
LNKQIVHFQQ3LSLSEKKNLTLSKEVQQIQSNYDIAIAELHVQK 
S KNQEQEEKIMKLSNEI ETATRS ITNNVSQI KLMHTK I DELRTL 
DSVSQISNIDLLNLRDI>S NGSEEDNLPNTQIiDLLGNDYL VSKQV 
KEYRIQBPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 
Q I E KLQAE VKG YKDENNR LKE KEHKNQDDLLKE KETL I QQLKE E 
LQEKNVTLDVQIQHWEGKRALSELTQGVTCYKAKIKELETILE 
TQKVERSHS AKLEQD I LE KES 1 1 LKLERNLKEFQEHLQDS VKNT 
KJ)LNVKELRLK£EITQLTNNLQDMKHLLQLKEEEEETNRQETEK 
LKEELSASSARTQN\LNADLQRKEEDYADI,KEKLTDAKKQIKQV 
QKEVSVMRDEDKLLRIKINELEKKKNOCSQELDMKQRXTIQQIiK 
EQLINQKVEEAIQQYERACKDLNVKEKI I EDMRMTLEEQEQTQV 
E0D0VL\EAKL5EVERIiATRLDR WP Vif rwnT.PTirMKino cwinTUf 

miTDVLGKLTNLQDELQESEQKYNADRKKWLEEKMMLITQAKEA 
ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 
ERDQLVAALEIQLKAI,I SSNVQKDNE I EQLKRI I SETSKI ETQI 
MDIKPKRISSADPDKLQTEPLSTSFEtSRNKlEDGSWLDSCEV 
STENDQSTRFPKPELEIQFTPLQPNKIvlAVKHPGCTTPVTVKIPK 
ARKRKSNEMEEDLVKCENKKNATPRTNLKFPISDDRNSSVKKEQ 
KVAIRPSSKKTYSLRSQASIIGVNLATKKKEGTLQKFGDFLQHS 
PSILQSKAKKIIETMSSSKLSNVEASKEWSQPKRAKRKLYTSE 
ISSPIDISGQVILMDQKMKESDHQIIKRRIiRTKTAK 


5399


705 


230 


G PRMA KFLSQDQ I NE YKE C FS LYD KQ QRGK I KATDLMVAMR CLG 
ASPTPGEVQRHLQTHGIDGNGELDFSTFLTIMHMQIKQEDPKKE 
ILLAMLMVDKEKKGYVMASDLRS KLTS LGE KI#THKEV\ DDLFRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931 


248 


SHCS S GME I P PTN Y PASRAALVAQN Y I NYQQGTPH RVFE VQ XVK 
QASMEDIPGRGHKYRLKFAVEEIIQKQVKVNCTA2VLYPSTGQE 
TAPEVNFTFEGETGKNPDEEDNTF YQRLKSMKE PLEAQN I \ PDN 
fgnvspemtlvlhijvwvacgyiiwqnstedtwykmvkiotvkqv 
qrnddfi eldyti llhniasqe iipwqmqvlwhpqygtkvkhns 

RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTTSLAFLAPRDFPFPPKLLIHPQAWRLSCGAGSMGS 
ynnMcpiui "Mon CiVjoooijovji_orio^r nJJUKX Vr ^TWMFSTYFME 
KWAPRQDDMLFYVRRKLAYSGSESGADGRKAAEPEVEVEVYRRD 
SKKLPGLGDPDIDWEESVCLNLILQKLDYMVTCAVCTRADGGDI 
HIHKKKSQQVFASPSKHPMDSKGEESKISYPMIFFMIDSF\BE\ 
VPS DMTVG KG EMVCVE L VAS DKTNTPQG VI FQGS I R Y EALKKVY 
DNRVSVAARMAQK\ MS FGFS KYSNMEF\VR\MKG PQGKGHAEMA 
VS R VSTG DTS PCGTE EDS S PAS PMHERVTS FS TP PTP E RNNRPA 
F FS PSLKRKV PRNR I AE M KKS H S ANDSEEFFREDDGGADLHKAT 
NLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHLTYVTLPLHRI 
LTD I LE VRQ K P I LMT 


5402 


3445 


1563 


GEC FI MAA WQQNDL VFE FASNVMEDER QhGDPA I FPA VI VEH V 
PGADILNS YAGtAC VEEPNDMI TESSLDVAEEE I IDDDDDDITL 
TVEASCHDGDETI ETI EAAEALLNMDSPGPMIiDEKR INNNI FS S 
PEDDMWAP VTHVS VTLDGI PEVMETQQVQEKYABS PGASS PBQ 
P KR KKGRKTKP PRPDS P ATTPN I S VKKKNKDGKGNT I YLWE FLL 
ALLQDKATCPKYIKWTQREKGI FKLVDS KPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNS KAAKPKDPVEVAQPS E VLRTVQPTQSPYPTQLFRTVHWQ 
P VQ AVPEGEAARTS TMQDETLNS SVQSIR\TI QAPTQ V P VWS P 
RNQC\ LHTVTLQrVPLTTVI AS TDPSAGTGSQKFILQAI PS SQP 
MTVLKENVMLQSQKAGS PPS IVLGPARV\QQVLTSNVQTICNGT 
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ 
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPNSF 


5403 


3445 


1563 


GECFI MAAWQQNDLVFEFASNVMEDERQLGDPAI FPAVI VEHV 
PGADI LNS YAGLACVEE PNDMI TES SLDVAEEE 1 1 DDDDDD I TL 
TVEAS CHDGDETI ETIEAAEALLNMDS PGPMLDEKRINNNI FSS 
PEDDM WAP VTHVS VTLDG I PEVMETQQVQE KYADSPGASS P EQ 
PKRKKGRKTKPPRPDSPATTPNISVKKFCNKDGKGNTIYLWEFLL 
mjijUUAAi^.Fi\.i AJ\WiyKEK.GIr IUjVDS KPVSRLWRKHKNKP\D 
MNYBPMGRALR YY YQRG I LAX VEGQRL VYQ FKEMP KDL I Y INDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLR1VQPTQSPYPTQLFRTVHVVQ 
PVQAVPEGEAARTS TMQDETLNS SVQS I R\TIQAPTOVPVWS P 
RNQQ\ LHTVTLQTVPLTTVI ASTDPS AGTGSQKFI LQAI PS SQP 
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT 
VSV\ ASS PSFS \ ATAP WTLFLLGSSQLVAHPPGTVI TSVI KTQ 
ETKTLTQEVBKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPNSF 


5404


187 


1111 


LPVTLI FAKMKTLQSTLLLLLLVPLI KPAPPTQQDSRI I YDYGT 
DNFEES I FSQDYEDKYLDGKNI KEKETVI IPNEKSLQLQKDEAI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNLRRLDFTGNLIEDIEDGTFSKL 
SLVEELSLAENQLLKL PVLPPKLTLFNAKYNKI KS RGI KANAFK 

KLNNLTFL YLDHN ALE CJV P I ,N L P F r ,BV T HT .OPTON T2VQT mnTP 

CKANDTSYIRDRIEEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405 


2199 


1^20 


QN S RS EHMDPQNQHG SGSSL W I QQ PS LDS R PRLD YE RE IQP TA 
I LS LDQIKA1 RGSNE YTEGPS WKRPAPRTAPRQEKH ERTHE 1 1 
PINVNNNYEHRHTSHLGHAVLPSNARGPILSRSTSTGSAASSGS 
NSSASS EQGLLGRS PPTRPVPGHRSERAIRTOPKQLI VDDLKGS 
LKEDLTQHKFICEQCGKCKCGECTAPRTLPSCLACNRQCLCSAE 

smveygtcmcl\vkgifyhcsnddegdsysdnpcscsqshccsr 

Y LCMGAMS L FL PCLLC Y P P AKG CLKLCRRC YDW IHR PG CRCKNS 
NTVYCKLESCPSRGQGKPS 


5406 


279 


2732 


RWRTYNVEGPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMLENY 






WO 01/53312 PCT/US00/34263



RNLVFLG/ 1 IAVSKPDLITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYXNCEHKNVHLKKDHKSVDE 
CKVHRGG YNG FNQCL P ATQ S K I FL FDKCV KA FH K FS N SNRH KIS 
HTEKKLFKC KE CGKS FCMLSH LAQHKI IHTR VNFCKCEKCG KA F 
NCPS I ITKHKRINTGEKPYTCEBCGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSSILTTHKIIRTGEKFYKCKECAKAFNQSS 
NIiTEHKK IH PG EKP YKCE ECGKA FNI^ P STLTKH KR.IHTG E KP YT 
CEECGKAFNQFSNLTTHKR IHTA\EKPYKCTECGEAFSRS \SNL 
TKHKE I HTE KK P Y KCBE CG KAF KWS S KLTEH KiyrHTO H Y P V vr w 
KCGKAFNCPS I ITKHNRINTGE KPYTCBECGKVFNWSSRLTTH K 
KNYTRYKLYKCEECGKAFNKSSILTTHKKIHIEKKFYKCEECGK 
AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 
GEKPYKCEECGKAFTQSSNLTTHKKIH'ItSEKPYKCEECGKAFrQ 
SSNLTTH KK I HTGGKP YXC EECG K A FWOF QT T.T vmrTTUTccirta 

YKCEECGKAFKWSSTLTKHKIIHTGE3CPYKCBECG\KAFKLSST 
LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 
EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNLTTH 
NK I HTG EKL Y KP ED VTVI LTTPQTFS N I K 


5407 


3 


659 


RPRRRQSSCCTGWLAGWLLRAAPRFCRRTETDMEQGKGLAVLIL 
AI I LLQGTLAQS I KGNHLVKV YD YQE DGS VLLTCDAEAKN I TWF 
KDGKMIGFLTEDKKKV?NLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATISGFLFAEIVS I FDLAVGVYFIAGTGMEFR 
QS \RASDKQTLLP \NDPAPTQPLKDPRKMTQYSHLQGN\QLRRN 


5408 


2745 


6128 


QGSKGTCHPQAQQpWDEGVWQEAPSQSEPWGQSQEPPTMPQRIiP 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKRBQGSL 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 
RLPTDliDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEKCGB 
VRNKDMS WPEEMS FIANSS KIDRHKVPTEKGATG LS tfLGNTCFM 
NSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEDLMRVHEKPYVELKJDSDGRPDWEVAAEAWDNHLRRNRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
EITVI KLDGTTP VRYGLRLNMDEKYTGLKKQLSDLCGLNSEQI L 
LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSP 
TQTDFSSS PSTNEMFTLTTNGDLPRPI F I PNGMPNTWPCGTEX 
NFTNGMVNGHMPSLPDSPPTGYIIAVHRKMMRTELYFLSSQKNR 
PS LFGM PLI VPCTVHTRKKDLYDAVW IQVSRLAS PLP PQEASNH 
AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFI GNAYI A VDWHPTALHLR YOTSOF R WDPH P <; Vimc- p d & o 

VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLVm 
LPPILIIHLKRFQFVNGRWIK5QKIVKFPRESFDPSAFLVPRDP 
ALCQH K PLT PQGDELS E PR I LARE VKKVDAQS S AGE ED VLLS KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQ IGSKNKLS S S KENLDAS KENGAGQ I CELADALSRGH 
VLGGSQ PE LVTPQDH EVALANG FLYEH E ACGNGCGNG YSNGQ IjG 
NHSEEDSTDDQR EDTR I KP I YNLYAISCHSGI LGGGHYVTYAKN 
PNCKW YC YNDSS CKELH PDE I DTDS AYI LFYEQQG I D YAQ FL P K 
TDGKKMADTSSMDED FE SDY\ EKYCVLQ 


5409 


2745 


6128 


qgskgtchpqaqqpwdegVWqe'ApsqsbpwgqsqepptmpqrlP - 

HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL 
APRPVPASRGGKTLCKGYRQAP PGP PAQFQRP I CSASPPWAS RF 
STPCPGGAVREDT YPVGTQGVPSLALAQGG PQGSWRFLEW KSMP 
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRiJKDMSWPEE^FIANSSKIDRHKVPTBKGATGLSNLGNTCFhl 
NSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDKHLRRNRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHIi 
EITVI KLDGTTP VRYGLRLNMDEKYTGLKKQLSDIiCGIiNSEQIIi 
LAEVHGSNI KNFPQDNQKVRLS VSGFLCAFEI PVP VS PISASSP 
TOTDFSSSPSTNEMPTLTTNGDLPRPIFIPNGMPNTWPCGTEK
NFTNGMVNGlIMPSLPDSPFTGYIIAVHRKMMRTELyFLSSQKNR 
PSLFGMPLI VPCTVHTRKKDLYDAVW IQVSRliAS PLPPOEASNH 
AQDCDDSMGYQY P FTLRWQ KDGNS CAWCP WYR FCRGC K I DCGB 
DRAF I GNAY I AVDWHPTAliHLRYOTSr)PR\A7hPMi? q vpn c ma an 

VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCI*ATKKLDL»WR 
LPP I L 1 1 HLKRFQFVNGRW I KSQKI VKFPRES FDP SAFLVPRDP 
ALCQHKPLTPQGDELSEPRILARBVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICELADALSRGH 
VLGGSOPELVTPODHEVAliANG FT, YEHParfrhinrr:wr:vcNr , riT 
NHSBEDSTDDQREDTRI KP I YNLYAISCHSGILGGGHY VTYAKN 
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPK 
TDGKKMALYTSSMDEDFES DY \EKYCVLQ 


5410 


2 


710 


LRFPGQARHVWLAARMQAPHKEHL YKLL VIGDLG VG KTS 1 1 KR Y 

VHQNFSSHYRATIGVDFAJJKVLHWDPETVVRLOL.WDIAGOERFG 

NMTRVYYREAMGAFIVFDVTRPATFEAVAKWKNDLDSKLSLPNG 

KPVSV\OjLANKCt)QGKDVLMNNGI.KMDQFCKEHGFVGWFETSAK 
ENINIDEASRCLVKHIIiANFmTJvfPQTPDrwrwvDUT tctvi^^h 

SG\CAKI LVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGS FGKPS P VTGLRAARRPJRTR PSAPAAPS VGC 
G KRRES DAG AGGERAS VRTG S GRRGG RTMAGDS EQTLQNHQQ ?N 
GGEP FL I G VSGGTASG KS S VCAK I VQ LLGQNE VDYRQKQ WI LS 
QDSFYRVLTSEQKAKAI/KGQFNFDHPDAFDNELILKTLKEITEG 
KTVQI P VYDFVSHSRKEETVTVYPAD VVLFEGI LAFYSQER / 1 R 
DLFQMKLFVDTDADTRLSRR VLKD I SERGRDLEQI LSSSTLR? V 
KPA\FEEFCLPPK\KYADVIIPR\GADN\RVPINLIVQHIQ\DI 
LNGGPS\NRQTNGCIiNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


QGI SNF FHKEANFWFE VSG YL I S PLRS P FVDPAIiE WS LMAS P WN 

KMEGESSRFEIHTPVSDKKKKKCSIHKERPQKHSHEXFRDSSLV 

NEQSQ I TRRKKR KKDFQHLI S3 PLKKS R I CDETANATSTLKKRK 

KRRYSALEVDEEAGVTVVLVDKENINNTPKHFRKDVDVVCVDMS 

IEQKLPRK\PKTDKFQVLAKSH\AHKSEALHSKVREKKNKKHQR 

KAASWESQRA\RDTLPQSEFPTQEESWLSVGPGGEITELP\ASA 

HKNKSKKKKKKSSNRE YET\ LAMPEGS QAGREAGTDMQESQ PTV 

GLDDETPQLLGPTHKKKSKKKKKKKSNHQEFESLAMPEGSQVGS 

EVGADMQES \RPAVGLHGETAG I PAPAYKNKSKKKKK.KSNHQEF 

EAVAMPESLESAYPEGSQVGSEVGTVEGSTALKGFKESNSTKKK 

S KKRKLTS VKRAR VSGDDFS VP SKNSESTJbFDS VEGDGAMMEEG 

VKSRPRQKKTQACLAS KHVQEAPRLEPANEEHNVETAEDSE I RY 

LS ADSGDADDSDADIiG S AV KOLOF FT P M T vnp at ct t itdmv o r»r\ 

IjERFKE FKAQGVAI KFG KFS VKENKQLEKNVEDFLAliTG ies ad 

KLLYTDRYPEEKSVITNLKRRYSFRLHIGVRNIARPWKLIYYRA 

KKMFDVNNYKGRYSEGDTEKLKMYHSLLGNDWKTIGEMVARRSL 

SVALKFSQISSQRNRGAWSKSETRKLIKAVEEVIIiKKMSPQELK 

EVDSKLQENPESCLSrVREKLYKGISWVEVEAKVQTRNWMQCKS 

KWTEILTKRMTNGRRIYYGMNALRAKVSLIERLYEINVEDTNEI 

DWEDLASAIGDVPP5YVQTKFSRLKAVYVPFWQKKTFPEIIDYL 

YETTLPLLKEKLEKMMEKKGTKIQTPAAPKQVFPFRDIFYYEDD 

SEGGGHRKRKRRPRRHAWFTP V I PVLWEAKAGWI I 


5413


3753 


1304 


RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPLLRRTARPGGG 
TPLLNGAG PGAARQS PRS AL FR VGHMSS VKLDDEIiLEP \ DMD P P 
HP FPKE I PHNE KLLSLKYESLDYDNSENQLFLEEERRINHTAFR 
TVEIfCRWVICALIGILTGLVACFIDIWENLAGLKYRVIKGNID 
KFTEKGGLS FSLLL.WATLNAAFVI1VGS VI VAFI EPVAAGSGI PQ 
I KCFLNGVKIPHWRLKTLVI KVSGVILSWGGLAVGKEGPMIH 
SGSVIAAG I SQGRSTSLKRD FKI FE YLRRDTEKRDFVS AGAAAG 
VS AAFGAPVGGVLFSLEEGAS FWKQFLTWRI FFASMISTFTLNF 
VLSrYHGNMWDLSSPGLINFGRFDSSKMAYTIHEIPVFIAMGVV 
GG VLG AVFNALNYWLTMFRIR Y I H RPCLQ V I EAVLVAAVTATVA 
FVLIYSSRDCQPLQGGSMSYPLQLFCADGEYNSMAAAFFNTPEK
SWSLFHDPPGSYNPLtLGLFTLVYFFLACWTYGLTVSAGVPlP
SLL Z GAAWGRL 7G I S LS Y LTG AAI WADPG KYALMGAAAQLGG I V
RMTLS LTVI MMEATS NVTYGF P I ML VLMTAK I VG DVF I EGLYDM
HIQLQSVPFLHWEAPVTSHSLTARBVMSTPVTCLRRREICVGVIV
DVLS DTASNHNG FPWEHADDTQPARLQGIil LRSQL I VLLKHKV
FVERSNLGLVQRRLRLKDFRDAYPRFPPIQSIHVSQDERECTT/D
LSBFMNPSPYTVPQEASLPRVPKLFRALGLRHLWVDNRNQWG
LVTRKDLARYRLGKRGLEELSLAQT


5414


2130


390 


GVASAWDRALFSPLLSPTSRVPRTSPPRCVSTETGRRDRARVPS 
QWCS VLQGKLP VSGRTS LAC VRS I LLS PASS PRKVG I VGGTGAR 
AGAAPRDHGRVRHRRPSSARRMTRTTGQCLAPRGCQGPRGTRSP 
RS P RS RTRRGCS AS PACLP*/ CRS ALI VAV LC Y INF tLN YMDRFTV 

AGVLPDIEQFFNIGDSSSGLIQTVFISSYMVLAPVFGYLGDRYN 
RKYLMCGGIAFWSLVTLGSSFIPGEHFWLLLLTRGLVGVGEASY 
STIAPTLI ADLFVADQRSRMLS I FYFAI PVGSGLGY IAGS KVKD 
MAGDWHWALRVTPGLGWAVLLLFLWREPPRGAVERHSDLPPL 
NPTSWWADLRAiARNPSFVLSSLGFTAVAFVTGSLALWAPAFLL 
RSRWLGETPPCLPGDSCSSSDSLIFGHTCLTGVLGVGLGVEI 
SRRLRHSNPRADPLVCATGLLGS AP FLFLSLACARGS 1VATY I F 
IFIGETLLSMNV/ArVADILLYVVIPTRKSTAEAFQIVLSHLLGD 

AGSPYLIGLISDRLRRNWPPSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


IPPKTKLELQKHXLTTLTNNQEQATIFEEVQKLRPRNEQRENEl, 
IISPLRCLFEBKQKEHIHIGEMKQTSQMAAENIGSELPP5ATRF 
RLDMLKNKAKRSLTESLES ILSRGNKARGLQBHS IS VDLDSSLS 
STLSNTSKEPSVCEKEALPISESSPKLLGSSEDLSSDSESHLPE 
EPAPLSPQOAFRRRANTLSHFPIECQEPPQPARGSPGVSQRKLVI
RYHSVSTETPHBK KDFES KANHLGDSGGTPVKTRRHSWRQQI FL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 
EKKRTSRELRELWQKAILQQILLLRNEKENQKLQASENDLLNKR 
LKLDYEE I TPCLKEVTTVWEKMLSTPGRSKI KFDMFkmhc; ft \rr n 

GVP\RHHRGEIWKFLAEQFHLKHQFPSKQQPKDVPYKEHjKQLT 
S QQHAI L I DLGRT F PTHP Y FS AQLG AGQliS LYN I LKA YSLLDQ3 
VGYCQGLSFVAGILLLHMSEEEAFKMLKFLMFDMGLRKQYRPDM 
IILQIQMYQLSRLLHDYHRDLYNHLEEHEIGPSLYAAPWFLTMF 
AS QFPLG F VAR VFDM I FLQGTEVT FKVALSLLGSHKPL I LQHEN 
LETIVDFIKSTLPNLGLVQMEKTINQVFEMDIAKQLQAYEVEYH 
VLQEELIDSSPLSDNQRMDKLEKTNSSLRKQNLDLLEQLQVANG 
R I QSLE AT IEKLLSSES KLKQAMLTLE LE RS ALLQTVEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLP P P S PQI* PKHNLHVT KTLMETRR RLEQERATMQMTPGE F 
R R PRLAS FGGMGTTS S LPS FVGSGNHNPAKHQLQNG YQGNGD YG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AI ALKRLKELEEQVRTI P VLQVKI S VLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQS TQR I KE FRQL \ TADMQALEQ KI QDS S CEAS S E LRENGE C 
RS VAVGAE E NMND I WYHRGS RS C KDAAVGTLVEMRNGGVS VTE 
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT 
KLKQELQAAGS RKKVDKATMAQ PL VFS KWEA WQTRDQMVGS H 
MDLVDTCVGTSVETNSVGISCQPECKNKWGPELPMNWWIVKER 
VEHHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECAS RGVNTEAVS QVE AAV 
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ 
TS TQTVETR TVA VG EGR VKD INS STKTRS IG VGTLLSGHS G FDR 
PS AVXT KE S G VGQIN I NDNYLVGLKMRTI ACGP PQLTVGLTAS R 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPB 
QE VGTS EG KP I SS LDA F PTQEGTLS P VNLTODQ IAAG L YACTNN 
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDV1EYPLEBEEEEEDBDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVEIRERYEJjSEKMLSACNLLKNTIND 
P KALTSKDMR FCLNTLQHEVf FRVS SQKSA I PAMVGDY I AAFEA I 
SPDVLRYVINLADGNGNTALHYSVSHSNFEIVKLLIiDADVCNVD 

hqnkagytpimljuuaaveaekdmriveelfgcgdvnakasqag 
qtalmlavshgridmvkgllacgadvniqddegstalmcasehg 
hvei vklllaqpgcnghledndgstals i al eagh kd i avlil y a 
hvnfakaqspgtprlgrktspgpthrgsfd 


5417 


27 


4074 


KS QLFC FWGG XAGDI LSG DQD KEQ KDP YF VE TPYG YQLDLDFL K 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGBF 
RR PRLAS FGGMGTTSS LPSFVGSGNHNPAKHQLQNGYQGNGDYG 
S YAP AAPTTS SMGS S I RHS P LS SG I S TP VTNVS PMH LQH I REQM 

AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
S0INVCGVRKR3YSAGNASQLEQLSRARRSGGELY2DYEEEEME 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RSVAVGAEENMNDIWYHRGSRSCKDAAVGTLVEMRNCGVSVTK 
AMLGVMT EADKE I E LQQQTI ES LKEKI YRLE VQLRETTHDREMT 
KLKQE LQAAG S RKKVDKATMAQ PL VFSKVVE AVVQTRDQMVGS H 
MDLVDTCVGTS VETNS VG ISCQPECKNKWGPELPMNWW I VKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCBTGSNTEESVNDLTLLKT 
NLNLKE VRS IGCGDCS VDVTVCS PKECASRGVNTEAVSQVE AAV 
MAVPRTADQDTS TDLEQVHQFTNTETATLI ES CTNTCLS TLDKQ 
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR 
P SAVKTKES G VGQIN I NDN YLVGLKMRT I ACG P PQI iTVG LTAS R 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE 

ESTLKSIMKKKBGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVSIRERYELSEKMIiSACNLLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVS SQKSA I PAMVGDY I AAFEA I 
SPDVLRYVINLAIX5NGNTALHYSVSHSNFEIVKLLLDADVCNVD 
HQNKAGYTPIMIiAAIiAAVEAEKDMRIVEELFGCGDVNAKASQAG 
QTALMLAVSHGRI DMVKGLLACGADVNIQDDEGSTALMCASEHG 
HVB I VKLLLAQPGCNGpLEDNDGS TALS I ALEAGHKDI AVLLYA 
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD 


5418 


24 


1133 


SVPRAGGDMBTGAAELYDQALLGILQHVGNVQDFLRVLFGFLYR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
IX3HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EP?I 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQVSVALSSSSIRVAMLEENGERVLM3GKLTHKINTESSLWSL 
E PGKCVLVNLS KVGE YW WNA I LEGE E P I D I DK I NKE RS MATVDE 

EEQAVIjDRLTFDYHQKLQGKPQSHELKVHEMLKKGWDAEGSPFR 
GQRFDPAMFNISPGAVQF 


5419 


1395 


259 


GTHPLDPDLVSRTSVGGPLMTMACPGMSDTEESPFL^PRAAEEG
S ES B ACEAFGRRKS E EEGR RS DTSG FGRS RKHKVN W KHPE RADA 
KDPASLPQC/liGP/lXOTPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRIQQWQQSPCIAEBHGKKLLERIRREQQSARTRLQEMERRFH 
ELEAI I LRAKQQAVREDEESNEGDSDDTDLQ I FC VS CGHP INPR 
VALRHMERCYAK YE SQTS FGS M Y PTR I EGATRL FCD VYN PQS KT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRUY CWEKLRRAEVDLERVRVW YKLDELFEQERNVRTAMTN 
RAGLLALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
eciistllfatlyiix:kifLtrfkkpaeftt\g^vikmppstri7~ 

LLELCTFTLATAT/JZVVT.T T DT?CT T c*TZ»trr t r> t nnvtwr/Mw 
ij*j&uv, x c a. uMinujMv a Li Litr ci± x oMivjjijb IjPRNYY IQWLiNGS 

LIHGLWNLVFLFSNLSL I FLMPFAYFFTESEG FAGSRKG VLGR V 

YBTVVMLMLLTLLVLGMVWVASAIVDKNKANRESLYDFWEYYLP 

YLYSCISFWVLLLLVCTPLGIiARMFSVTGKLLVKPRIiLEDLEE 

QLYCSAFEEAALTRRICNPTSCWLPLD^IELLHRQVLALQTQRVL 

LE KR RKASAWQRNLG YPLAMLCLLVLTGLS VLI VAIH I LELLID 

EAAM PRGMQGTSLGQVS FS XLGSFGAVIQVVL I F YLMVSS WGF 

YSS PL FRS LR PRWHDTAMTQ I t GNC VCLL VLSS ALP V FS RT LGL 

TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5421 


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
ECIISTLLFATLYILCHIFLTRFKKPAEFTT\GMMKMPPSTRL/ 
LLE LCTFTLA I ALG AVL L LP FS 1 1 SNE VLLSLPRN YY IQW LNGS 
LIHGLNNLVFLFSNLS LI FLMPFAYFFTESEG FAGSRKG VLGR V 
YETVVMLMLLTLLVLGMVWVASAIVDKNKANRBSLYDFWEYYLP 
YLYSCISFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QL YCSAFEEAALTRRI CN PTS CWLP LDMELLHRQ VLALQTQR VL 
LEKRRKAS AWQRNLGY PLAMLCLLVLTGLS VL IVAIHILELLID 
EAAM PRGMQGTS LGQVS FS KLGS FG A V I QWL I F YLMVS S WGF 
YSS PLFRSLR PRWHDTAMTQ I IGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5422 


3 


1263 


SCGES LPTWLAGAS RPGIGRKGGAWGGRGGSSPAQ VLLSPGPVF 
KAGCNWWHLSRDQAGVQRCDLGSSQPPPLGFKRFSCLSLPSSWD 
YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGWPPGTQVEQM.YAKKLYDSAF 
H PDTG E KMNV I G RMS FQL PGGM 1 1 TG FMLQ FYRTM P AVI FWQW V 
NQS FNAL VNYTNRNAAS PTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKAP P L VGRW VP FAAVAAANCVN I PMMRQQELI KGICVKDRN 
ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVIMERLEKLH 
FMQKVKVL/ SAPLQVMLSGCFLI FMVPVACGLFPQKCELFVSYL 
EPKLQDTI KAKYOELEPYVYFNKGL 


5423


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGEQ 
PPRLHAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPCASD 
ANRE P VAERS EPALSGLPPATMGSGDLLLSGESQVEKTKLS S S E 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LS CLS QW KS VLS PGS AAQP S SCSI SAS S TGSSLQGHQERAE PRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAEYWACVLPDSLPPS PDRHSPLWNPNKE YEDLLDYT 

YPLRPGPQLPKKLDS RVPADPVLQDSGVDLDSFS VS PASTLKS P 

TNVSPNCPPARIlTAT.DFCflDJJ j?dct vnuncDtmnvrtnr'»«/-iT 

* m v u j? « \. r rnoH l nxt fc oij r ttr. F a U KU W Fo K V V \j K.Q (aGMGLASW 

SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 

rtrdrgwpsprperekrtsqsarrptctesrwkseeevesddey 
lalparltqvsslvsylgsistlvtlptgdikgqsplbvsdsdg 
pasfpssssqsqlppgaalqgsgdpegqnpcflrsfvrahdsag 
egslgssqalgvssgllktrpslparldrwpfsdpdvegqlprk 
ggeqgkeslvqc\vktfc\cqleelicwlynv\advtdhgtpar 
snltslkXsslqlyrqfkkdidehqsltesvlqkgeillqclle 
ntpvledvi^griakqsgeleshadrlydsilasldmlagctlip 
dkkpmaamehpcegv 


5424 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDIPGPMSGBQ
P PRLEAEGGL I S P VWGAEGI PAPTCW IGTDPGGPS RAHQ PQAS D 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSE 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHOERAEPRG 
GSLAKVSS SLE P WPQE PSS WGLGPR PQWSPQ P VFSGGDtAS GL 
GRRRLS FQAEYWACVLPDSLPPS PDRHSPLWNPNKBYEDLLDYT 
YPLRPGPQLPKKLDSRVPADP VLQDSGVDLDS FSVS PASTLKS P 
TWVS PNC P P AEATALPFSGPRE PS LKQW PS RV PQKQGGMGLAS W 
SQLASTPRAPGS RDARWEkREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGW PS PR P EREKRTS QS ARR PTCTBSRWXS EEEVES DDE Y 
LALPARLTQVSSLVSYLGSISTIiVTLPTGDIKGQSPLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 

egsijgssoalgvssgllktrpslparldrwpfsdpdvegqlprk 
ggeqgkbslvqc\vktfc\cqleelicwlynv\advtdhgtpar 

SNLTSLKXSSLQLYRQFKXDIDEHQSLTESVLQKGEILLQCLLE 

ntpvledvlgriakqsgeleshadrlydsilasldmlagctlip 
dkkpmaameh p CEG V 


5425 


1086 


115 


gfcpspslghqpprvlhptmsmavetfgffmatvgllnlgvtlp 
nsywrvstvhgnvittntifenlwfscatdslgvyncwefpsml 

ALSGYIQACRALMITAILLGFLGLLIjGIAGLRCTNIGGLELSRK 
AKIiAATAGAPH\ILPGICGMVAI\SWYAFNITR\DFSDPLYPGT 

kyelgpalylgwsaslisilgglclcsacccgsdedpaasarrp 
yqapvsvmpvatsdqegdssfgkygrnalrvaalcrgprclpta 
pkkrgpgrgpfpysnlrgrprpvpvapprprprvlhshgpsqak 
ncswevaylpseagslif 


5426 


42 


3435 


atssqslgradpprggtmerspgegpspspmdqpsapsdptdqp 

PAAHAKPDPGSGGQPAGPGAAGKALAVLTSFGRRLLVLIPVYLA 

gavglsvgfvlfglalylgwrrvrdekekslraarqllddeeql 

TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 
LLAETVAPAVRGSNPHLQT ftftr velgekplri igvkvhpgqr 

KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQU1GVLRVIL 

epligdlpfvgavsmffirrptldinivtgmtnlldipglsslsd 

TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIRIHL 

laarglsskdkyvkgliegksdpyalvrlgtqtfcsrvldeeln 
pqwgetyevmvhevpgqeievevfdkdpdkddflgrmkldvgkv 

LQASVLDDWFPLQGGQGQVHI.RLBW/^L/jSDAEKLEQVLQWNWG 

vssrpdppsaailvvyldraqdlpmvtselyppqlkkgnkepnp 
mvqls iqdvtqes kavystncpvwe eafrfflqdpqsqbldvqv 
kddsraltlgaltlplarlltapelildqwfqlsssgpnsrlym 
klvmr ilyldsse icfptvpgcpgawdvdsenpqrgss vdappr 
pchttpdsqfgtehvlr i hvleaqdli akdr flgglvkgksdp y 

VKLKLAGRSFRSHWREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFLGRCJKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL 
ERLTPR PTAABLEE VLQ VNSLIQTQKSAELAAALLS I YMERAED 
LPLRKGTKHLSPYATLTVGDSSHKTKTISQTSAPVWDESASFLI 
RKPKTES LELQVRGEGTGVLGSLSL PLS ELLVADQLCLDRW FTL 
SSGQGQVLLRAQLGItiVSQHSGVEAHSHSYSHSSSSLSEBPELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE 
ERKLVS1VHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


ATSS Q S LGRADP P RGGTM ER S PGEG P S P S PMDQ PSAPSD PTDQP 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA 
GAVGIiSVGFVLFGLALYLGWRRVRDBKERSLRAARQLLDDEEQL 
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEX 
LLAETVAPAVRGSNPHLQT FTFTR VELGEKPLRI IGVKVHPGQR 
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 
TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGIIRIHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEBLN 
PQWGET Y E VMVHE VPGQEI E VEVFDKDP D KDDFLG RM KLD VG KV 
LQASVLDD WFPLQGGQGQ VHLRLEWLS LLS DAE KL EQ VLQWNWG 
VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPQLKKGNKEPNP 
M VQLS I QDVTQES KAVYSTNCP VWEEAFRFFLQDPQSQELDVQV 
KJ3DSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTP DS QFGTBH VLR I HVLE AQDL I A KDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFljGRCKVRiTTVLNSGFLDEWLTLEDVPSGRLHLRL 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








ERLTPRPTAAifiLEBVI^VNSLIQTQKSAJilJ^AALiLSIYMERAE^ 
L PLRKGTKHLS PYATLTVGDSSHKTKTI SQTS APVWDES AS F L I 
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
SS GQGQV LLRAQLG I L VS QHS G VE AHSHS YS HS SSSLSEEPELS 
GGPPHITSSAPEV\RQRLTHVDSPL2APAGPLGQVKLTLWYYSE 
B R KL VS I VHG CRSLRQNGR DP PDP YVSLLLLPDKNRGTKRRTS Q 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 
5429 


3 


1839 


SSRS3RLSACAIAPPWLVSSRPARPAQLQRPGKMVEDGAEELED 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 
U^SIPYFHAMFTNDMMECKQDEIVMCCMDPSALBALINFAYNG 
NLAI DQQNVQS LLMGAS FLQLQS I KDACCT FI*R ERLHPKNCLGV 

RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFLALPLEDVLEL 
VSRDELNVKSEEOVFEAATxAWVPYni3PnDr"TT?T V dmt novt^r T 

FCRPQFLSDRVQQDDLVRCCHKCRDLVDEAKDYZjLMPERRPHIjP 
AFRTRPRCCTSIAGLIYAVGGLNSAGDSLNWEVFDP1ANCWER 
CRPMTTARSRVGVAWNGLLYAIGGYDGQLRLSTVQAYNTETDT 
WTRVGSMNS KRS AMGTWLDGQ I Y VCGG YDGNS SLSS V ET Y S PE 
TDKWTWTSMSSNRSAA\GVTVFEGRI YV^nnwnp r .n t cc c\Tt?u 

YNHHTATWHPAAGMLNKRCRHGAASU3SKMFVCGGYDGSGFLS1 
AEMYS S V\ ADQWCL I VPM \ HTRR \ SR VS LGG PAVGRL YAVWG VT 

TGQSNL\SSVGDVLTPETDCWTFM\APMACHEGGVGVGCIPLLT 


5430 


82d 


202 


RREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQPTYPYLQHEIDIiPPTISLSDGEEPPPYQGPCTLQ 
LRDPEQQLELNRESVRAPPNRTIFDSDLMDSARIX3GPCPPSSNS 
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL 
LEGTR LH HTH I APLES AAI WS KBKDKQKGHPL 




441 


1507 


QKRRKRRRKKIMKTIQP^lHNSISWAIFTGLAALCLFC^VPVRsH 
GDAT FP KAMDNVT VRQGE S ATLRCT I DNR VTR VAW LNRS T I L YA 
GNDXWCLDPR WLLSNTQTQYS I EIQNVDVYDEG PYTCS VQTDN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRHISPKAVGFVSEDEYLEIQGITREQSGDYECSASNDV\A 
A PV\ VRR VKVTVNYPPYlSEAXGTGVPVGQKGTLQCE AS AVPS A 
EFQWYKDDKRLI/EGKKGVKVENRPFLSKLIFFNVSEHDYGNYT 

CVASNKLGHTMASIMLFGPGAVSEVSNGTSRRAGCVWLLPLLVL 
HLLLKF 


5431 
5432 


2 


1312 


AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWP&cqvdvT — 

LPGITINP\TIAEGPSP\TSEGASEANLVDLOKKLEELELDEQQ 

KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS 

GLlMARKLlHLEIKPAIRKQIIRELQVLHECNSPyiVGFYGAFY 

SDGEISICMEHKDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA 

YLREKHQlr/IHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS 

FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVEltAVGRYPIPPP 

DAKELEAX FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 

Af4AIFELLDYIVNBPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 

DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5433 


2 


1312 



AAAAPGSRRRRPLPDRPHMAHGYEAPPPPAPRSPAWRapgk-pvV — 
LPGITINP\TIAEGPSP\TSBGASEANLVDLQXKLBELELDEQQ 
KKRLEA FLTQKAKVGELKDDDFERIS ELGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQI I RELQVLHE CNS P Y I VG F YGAF Y 
SDGEISICMEHMDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLA 
YLREKHQ IMHRD VKPSNT LVNS RG E I KLCD FG VS GQL I DS MANS 
F VGTRS YMAP ERLQGTH YS VQS D I WS MGLS LVE LA VG RY P I PP p 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
fVMAlFELLDYlVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
DLKMLTNHTFIXRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 




360 


1885 

3 

1 


aVQEDKVGFEDPLHLCSWRARACPCTWPHC/CTGLLECLGFAGV 
jFGWPSLVFVFKNEDYFKDLCGPDAGPIGNATGOADCKAQDERF 
SLIFTLGSFMMNFMTFPTGYIFDRFKrTVARLIAIFFYTTArLI 
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SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








IAFTSAGSAVLLFLAMPMLTIGGILFIilTNLQIGNLFGQHRSTI 
I TLYNGAFDS S SAV FLI I KLL YE KGI SLR / VLLHLHLCLQ YLAC 
STHFPPDAPGAHPIPTAPQLQLWPVPWBWHHKGREX5/QQLSMKT 
GS YSQRS S FQRRKR PQGQGRS RNS APSG ATL/ CSRR FAWHL VWL 

SVTOLWHYLFIGTT.N^T .T.TNMIH^/2nV7\rj170»PV»P\T'7\ trm nmnmi -» 
AVJAlJlVOlJljALNlTMVjVlA*iAKVO I X I N AF AFTQ FGVL 

CA P WNGLLMD RL KQ KYQ KEAR KTG S S TLA VALCS TV?S LALTS L 
LCLGFALCASVPILPLQYLTFILQVISRSFLYGSNAAFLTLAFP 
S EH FGKL PGLVMALSAWSLLQ F P I FTL I KGS LQNDP FYVNVM F 
MLAILLTFFHPFLVYRECRTWKESPSAIA 


5434 


66 


652 


RYAALIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF/GSGK 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLBGQVCPACGANLV 
LRQGRFGMFIGCXNYPECEHTBLIDKPDBTAITCPOCRTGHLVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGECPECHYPLLIEKKT 
; AOGVFCHFCASKQCGKPVSAE 


5435 
5436 


4704 


1597 


j PGDSSQRLAEMSNAKERKHAKIWRNQPTNVTLSSGFVAI)RGVKH 
! HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNBQSSSK 
GMFR KKGG W KAGPEGTS Q E I P KY I TASTFAQ ARAAE I S AMLKAV 
TQKSSNSLVFQTLPRHMRRRAMSHNVKRLPRRLQEIAQKEAEICA 
VHQKKEHSKNKCHKARRCHMNRTLEFNRRQKKNIWLETHIWIIAK 
RPHMVKKWGYCLGERPTVKSHRACYRAMTNRCIjLQDLSYYCCIjB 
LKG KEEEI LKALS3MCN I DTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPRBMLGPVTFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 
IKAACQCVEPIKSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 
KDDGENAKPIKKI IGDGTRDPCLPYSWISPTTGI IISDLTMEMN 
RFRLIGPLSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 
DS VSLHCRQEAI FELLGG I TSPAEI PAGTILGLTVGDPRINL PQ 
KKS KALPNP E KCQDN EKVRQLLLEGVP VECTH S F I WNQD I CKS V 
TENKI SDQDLNRMRSELLV PGSQL I LG PHES K I P I LLI QQPG KV 
i\>t.UHUj WGSG WDVLLP KG WGMAFWI P FI YRG VR VG GLKES A VH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 
YVKLGTLAPFCCPWEQLTQDWESRVQAYEEPSVASSPNGKESDL 
RRS EVP CAPM P KKTHQ PS DE VGTS I EH PREAEE VMDAGCQESAG 
PERITDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 
RGGRRAPGRGQQGLTREACLSILGKFPRALVWVSLSLLSKGSPE 
PHTMICVPAKEDFLQLHEDWHYCGPQESKHSDPFRSKILKQKEK 
KKREKRQKP\GRASSDGPAGEEPVAGQEALTLGLW3GPLPRVTL 
HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTGLLDMLSSQPAAQ 

RflTiVT.T.PDDIkQT nVDDRDTH TTJil 




1781 


635 


ASDSIPWSEARTTRKLAQRGCQWSLPERMPLWFCGLPYSGKSR 
RAEELR VALAAEGRAVYWDDAA VLGAED PAVYG DS AR E KALRG 
ALRASVERRLSRHDWILDSLNYIKGFRYELY\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSSVLRELHTADSWNGSAQADVPKELEREBSGAAESPALVTPD 
SEKSAKHGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPLAGIRSALFENRAPPPHQSTQSQPLASGSFLHQLDQVTS 
w vuhkjlmh. Ay K£»a VPGDLLTL PGTTEHLR FTR PLTMAELSRLRR 
QFISYTKMHPNNENLPQLANMFLQYLSQSLH 


5437 
5438 


739 


lfi72 


CQEAASEFGGPLH7PAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 

PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 

PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 

WDHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKLLGWLR 

GEPGAPSRYIXrGPEECIiQISTNLTIJHLLELIASALIiAliCSRPLR 

AALDTLGLRGPLGLWLHGLLSFLAALHGLHAVLSLLTAHPLHFA 

CLFGLLQALVLAVSLREPNGDEAATDV3BSEGLEREGEEQRGDPG 
KGL 




2443 


1152 



TKPRKRRHQPASgRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQLSC 
viSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
/PTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
3LHHYRNLSEFFRRKLKPQARPV0GLHSVXSPSDGRILNFGQVK 
^CEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








REGNSLYKCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP 
GMARWIKELFCHNERWLTGDWKKGFPSLTAVGAT\NWGSIRIY 
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGBHLG /OS 
FNLGST1 VLI FEAPKDFNFQLKTGQKIRFGRAU3S L 


5439 


2443 


1152 


TKPRXRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
IAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQliSC 
MSRPALKLRSWPliTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VP TRLLSRAWGRLNQVELPHWLRRP VYS b Y I WT FG VNMKEAA VE 
DLHHYRNLSEFFRRKIiKPQARPVCGLHSVISPSDGRILNPGQVK 
KCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT 
REGNELYHCV I YLAPG0YHCFHS PTDwTVSHRRH FPGSLMS VNP 
. GMARWI KELFCH N E RWLTGDVJKHG F FS LTAVGAT \NWGS I R I Y 
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS 
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL 


5440 
5441 


*d3 


253 


EPIPVTPDHRLVTMTHIV\QTFSPVNS \GQPPNYEMIjKEEQEVA" " 
MU3APHNPAPPMSTVIHIRSETSVPDHWWSLFNTLFMNTCCLG 

fiafaysvksrdrkmvgdvtgaqayastakclniwalilgifmt 
illiiipvlwqaqr 




2 


2054 


crdggkngfmvspmkplei ktqcsgprmdpkicpadpaffs fin 

NSDliWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIALKIiAEFQTDSQGKIVSTOEKE 
IjVQPFSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLL 
PPALFIPSTENEEQ\RI^ARAVPRNVQPYVVYEEVTNVWII^VK 
DIFYPFPQSEGEDELCFLRANECKTGFCHLYKVTAVIiKSQGYDW 
SEPFSPGEGEQSl*TNAIWVNEETKLVYFQGTKDTPLEHHt»YWS 
YEAAGEIVRLTTPGFSHSCSMSQNFDMFVSHYSSVSTPPCVHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQPGKKHPTVLFVYGGPQVQLVNNSFKGIKYLRLNTliASLGY 
AVWI DGRGSCQRGLRFEGALKNQMGQVEI EDQVEGLQFVAEK Y 
GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVMM 
AYDTGYTERYMDVPENNQHGYEAGSVALHVEKLPNEPNRLLILH 
GFLDENVHFFHTNFLVSQLIRAGK PYQLQVALPPVS PQI YPNER 
HSIRCPESGEHYEVTLLHFLQEYL 


5442 


1 


3474 


CGQRSRRRSPDMPEAKPAAKKAPKGKDAPKGAPKEAPPKEAPAE " 

APKEAPPEDQSPTAEEPTOVFLKKPDSVSVETGKDAVWAKVNG 

KELPDKPTIKWFKGKWLELGSKSGARFSFKESKNSASNVYTVEL 

H IGKVVLGDRGY YRLE VECAKDTCDS CGFNI D VEAPRQDAS GQS L 

E S FKRTS E KKSDTAGE LDFSGLL KKRE WE EE KKKKKKDDD DLG 

IPPEIWELLKGAKKSEYEKIAFQYGITDLRGMLKRLKKAKVEVK 

KSAAJ^*KKLDPAYQVI>RGNKIKLMV3ISDPDLTLKWPKNGQEIK 

PSSKYVFENVGKKRILTINKCTLADDAAYBVAVKDEKCFTELFV 

KEPPVLIVTPLEDQQVFVGDRVEMAVEVSEEGAQVMWMKDGVEIi 

TREDSFKARYRFKKDGKRHILIFSDWQEDRGRYQVITNGGQCE 

AELIVEEKQLEVLQDIADLTVKASEQAVFKCEVSDEKVTGKWYK 

NGVE VR PS KR ITISHVGRFHKLVIDDVR PEDEGDYTFVPDGY AL 

GS LS AKLNFLE I KVE YVPKQ\EP P KI PLGFASGG KTSENAD/ IV 

WAGNKLRLDV\SITGEAPSPFAT\WLKG\DEVFTTTEGRTRIE 

KRVDCSS FVIESAQREDEGRYTIKVTNPIGEDVAS I FLQWD VP 

DPPEAVRITSVGEDMAILVWEPPMYDGGKPVTGYLVERKKKGSQ 

RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSOPSMN 

TKPFMPIA PTS 3PLHLI VED VTDTTT TLKWR P PNR I GAGG I DG Y 

LVEYCLEGSEEWVPANTEPVERCX5FTVKNLPTGARILFRVVGVN 

IAGRS E PATLAQP VTI REIAEPPKIRLP RHLRQTYI RKVGEQLN 

L W P FQG K PR PQ WWTKGGAPLDTSR VHVRTS DFDT VFFVRQAA 

RSDSGEYELSVQIENMKDTATIRIRWEKAGPPINVMVKEVWGT 

NALVEWQAPKDDGNSEIMGYFVQKADKKTMEWFNVYERNRHTSC 

TVS DL I VGNE YY FR VYTENI CGLS DS P GVS KNTAR I LKTG I TFK 

PFEYKEHDFRMAPKFLTPLIDRVWAGYSAALNCAVRGHPKPKV 

VWMKNKME I RED PK FL I TNYQG VLTLK IR RPS P FDAG T YTCRAV 

NELGEALAECKLEVRVpQ 
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SEQ — 

ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 

sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


5443 


66 


1003 


SRGQLDAGQSS EQHGGNRQPEQSRS RSSSSSSSPRRSRSAAEPA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWL KG W FS VTTVDLKRKP ADLQNLAPGTH PP PI TPNS EVKTDV 
NKIEEFLBEVLCPPKYLKLSPKHPESNTAGMDIFAKFSAVIKNS 
RPEANEAIiERGLLKTLQKLDEYLNS PLPDEI DEN3MEDI KFSTR 
KFLIX^MTLADCNLLPKLHIVKVVAKKYRNFDIPKEMTGIWRY 
LTNAYSRDE FTNTCPS DKEVE I \ AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


SGPIGVTGAQMAKWLRDYLSFGGRRPPPQPPTPDYTESDILRAY 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
IKVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
YMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGQKPR 
QSRM PQEDER PADE YDQP WEWKKDHI S RAFAVQ FDS PE WERTPG 
SAKELRRPPPRSPQPAERVDPALPLEKQPWFHGPLNRADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHS GP FPS VP EL VLH YS SR PL P VQG AEH LAL L Y P WTQTP * Q 
*PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGI*HRERHPEGLP 
RAE KPGLRG P LLG L R E P LG AG PRG P WGLQE PR RCQVW F S Q AP AH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


486 


ILSRGFLGSVEICIQLPLPASEPVLLLTWARRRWRETRSRREPT 
TLRAQSVCPWWI*ETRMNRSIPVEVDESEPYPSQLLKPIPEYSP 
EEESEPPAPNIRNMAPNSliSAPTMLHNSSGDFSQAHSTLKCANH 
QRPVSRQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHQFSFMEKRNQWLVSQLSAASPDTGHDSDKSD 
QSLPNASADSLGGSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLGIRQLERPLPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLPPNLSPHAPWNYHYHCPGSPDHQVPYGHBYPRAAYQQVIQP 
ALPGQPLPGASVRGLHPVQKVILNYPSPWDQEERPAQRDCSFPG 
LPRHQDQPHHQPPNRAGAPGESLECPAELRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAIDI FEDRIRG IDIIKWMBR YLRDKTVMI I VAIS PKYKQ 
DVEGAESQLDED3HGLHTKYIHRMMQIEFIKQGSMNFRFIPVLF 
PNAKKEHVPTWLQNTHVYSWPKNKKNILLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SSWSWCTGRMRKTRLWGLLWMLFVSELRAATKLTEEKYELKEGQ 
TLDVKCDYTLEKFASSQKAWQI I RDGEMPKTLACTERPS KNSH P 
VQVGRIILEDYHDHGLLRVRMVNLQVEDSGIiYQCVIYQPPKEPH 
MLFDRIRLWTKGFSGTPGSNENSTQNVYKI P PTTTKAL CPLYT 
TPRTVTQAPPKSTADVSTPDSEINLTNVTDIIRVPVFNIVILLA 
GGFLS KS LVPSVLFAVTLRS FVP * AHEPTRMSS DFQPHPSGS CA 
KGGGRR 


5447 


207 


617 


WARTLSLMASLVAYDDSDSEAETEHAGSFNATGQQKDTSGVAR 
PPGQDFASGTLDVPKAGAQPTKHGS CEDPGG YRIiPLAQLGRS DR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSSFQKKKCEDCWPY 
TPRRLRQRQALS TETGKG KD VE PQGPPAGRAPAPL YVG PG VS E F 
IQP YLNS HYKETT VPRX VL FHLRGHRGPVNTIQWC P VLS KSHML 
LSTSMDKTFKVWNAVDSGHCLQTYSLHTEAVRAARWAPCGRRIL 
SGGFDFAUJLTDLETGTQLFSGRSDFRITTLKFHPXDHNIFLCG 
GFSSBMKAWDIRTGKVMRSYKATIQQTJjDILFLREGSEFLSSTD 
ASTRDSADRTI 1 AWDFRTSAKISNQI FHERFTCPSLALHPRE PV 
FLAQTNGNYLALFSTVWP YRMSRRRR YEGHKVFC5 YQVfirpr c on 
GDLLVTGS ADGRVLMYS P RTAS RACTLQGHTQ AC VGTT YH P V LP 
SVLATCSWGGDMKZWH*AFHWLSLGEAIGDLAPARGYSGPGRSL 
KSPSPS KS LLVLLCGRAM FQPATCPWQLPALS K 


5448 


194 


1833 


KASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA 
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKG I VRWFFP F FF 
RWWI^VTSKVIFFWLLVLYLLQVAAIVIiFCSTSSPHSIPLTBVI 
G P I WLMLLLGTVHCQ I VSTRTPKPP LS TGGKRR R KLRKAAHLE V 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNSIDKS TETDNG Y VS LDGKKT VKSGEDG IQNHEPQCET 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








IRPEETAWNTCTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDS ESAR PES ETED VLWEDLLHCAECHS SCTS ETD VENHQ INPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIWEGNDCKKADMS 
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGZjTPFVFRLSQA 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVI ISFWRVSLVWI 
FFFLLCVAERTYKQVGIM'TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCSSRCSSS RQDSES ARPES ETEDVLWEDL LHCAECHS S CT 
SBTDVENHQ I N P CV KKE Y RDDP FHQS HLPWLHS SHPGLE KI SAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI 
I/3LTPFVFRLSQATDLEQI>TAHSASELYVIAFGSNEDVIVLSMV 
I IS FWR VSLVWI FFFLLCVAERTYKQVGIM 


5449 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA 
HVKPDLIDVDLVRGS AFAKAKPES PWTS LTTKG I VR WFFPFFF 
RWWLQ VTS KV I FF WLLVL YL IjQ VAAI VL FCSTSS PHSIPLTE VI 
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GS KKAKNS I DKS TETDNG Y VS LDG KKTV KSGE DG I QNHE P QC ST 
IRPEETAWNTG TLRNGPS KDTQRTI TNVSDEVSS EEGPETG YS L 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSBTDVENHQINPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKrSAIVWEGNDCKKADMS 
VLEISGMIMNRVTJSHIPGIGYOIFGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSAS EL YVI AFGSNEDVI VLSMVI I S FWRVS LVWI 
FFFLLCVAERTYKQVGIM *TSEGVLRNRKSHHYKKHYPNEDA?K 
SGTSCS 3 RCSS S RQDSES AR PE S ETEDVLWEDL LHCAECHSS CT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGNDCKKADMS VLB I SGM I MNRVNS H I PG IG YQ I FGN A VS L I 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNSDVIVLSMV 
1 1 SFWR VSLVWI FFFLLCVAERTYKQVGIM 


5450 


8136 


1242 


GQQFASFFG*NHPEVTVAMALTDIDLQLQFSMSQPEALLLLAAG 
PADHLLLQLYSGHLQ VRLVLGQEELRLQTPAETLLSDS I PHT W 
LTWEGWATLSVDGFLNASSAVPGAPLEVPYGLFVGGTGTLGLP 
YLRGTSRPLRGCLHAATLNGRSLLRPLTPDVHEGCAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG 
RRGDF I Y VDI FEGHLRAWEKGQGTVLLHNS VP VADGQPHE VS V 
HINAHRLEISVDQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 
HLQEHRLGLTPEATNASLLGCMEDLSVNGQRRGLREALLTRNMA 
AGCRLE E E E YEDDAYGHYE AFS TLAPB AW P AMEL PE PC VPE PGL 
PPVFANFTQLLTISPLWAEGGTAWLEWRHVQPTLDLMEAELRK 
SQVLPSVTRGAHYGELELDILGAQARKMFTLLDWNRKARFIHD 
GSEDTSDQLVLEVSVTARVPMPSCLRRGQTYLLPIQVNPVNDP? 
HI I FPHGS LMVILEHTQKPLGPE VFQAYD PDSACEGLTFQVLGT 
SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR 
VSDGLQAS PPATLKWAIRPAIQIHRSTGLRLAQGSAMP ILPAN 
LSVETNAVGQDVS VL FR VTGALQ FGELQKHS TGGVEGAE WWATQ 
AFHQRDVEQGRVRYLSTDPQHHAYDTVENLALEVQVGQEILSNL 
S FPVT I QRAT VWMLRLE PLHTQNrQQETLTTAHLEATLEEAGPS 
PPTFH YE WQA P R KGNLQLQGTRLSDGQG FTQDDI QAGRVT YGA 
TARASEAVEDrFRFRVTAPPYFSPLYI'FPIHIGGDPDAPVLTNV 
LLWPEGG EG VLS ADHL FVKS LNSAS YL YE VMER PRLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDDI PFVATRQGE 
SSGDMAWEEVRGVFRVAIQPVNDHAPVQTISRIFHVARGGRRLL 
TTDDVAFS DADSGFADAQLVLTRKDLLFGS IVAVDEPTRPI YR F 
TQEDLR KRR VLFVHS GADRGW IQLQ VSDGQHQATALLEVQASE P 
YLRVANGS SLWPQGGQGTIDTAVLHLDTNLDI RSGDEVHYH VT 
AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGSLSPEDTMAF 
S V£ AGP VHTDATLQ VT I ALEG P LAP LK L VRHKfCI YVFQG EAAE I 
RRDQLEAAQEAVPPAD I VFS VKS PPS AG YLVM VSRGALADE PPS 
LD P VQS FS QE AVDTGRV LYLHS R PEAWS DAFS LDVASG LG APLE 
GVLVBLEVLPAAI PLEAQNFS VPEGGSLTLAP PLLRVSGP YFPT 
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SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LLGLS LQVLEP PQHG PLQK5DG PQARTLS AFS WRMVEEQLI R YV 
HDGSETLTDSFVLMANASEMDRQSHPVAFTVTVLPVNDQPPILT 
TNTGLOMWEGATAPIPAEALRSTDGDSGSEDLVYTIEQPSNGRV 
VLRGAPGTEVRSFTQAQLDGGLVLFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTDPQLI.LYRWRGPQLGRLFHAQQDSTGEALVNFTQAEVYA 
GN I LYEHEM P PE P FWEAHDTLE LQ LS S PPARD VAATLAVAVS PE 
AACPQRPSHLWKNKGLWVPEGQRARITVAALDASNLLAS VPS PQ 
RSEHDVLFQVTQFPSRGQLbVSEEPLHAGQPHFLQSQIiAAGQLV 
Y AHGGGGTQQDG FH FRAHLQGPAGAS VAG PQTS E AFAIT VRDVN 
ERP PQPQAS V PLR LTRG S RA PISRAQLS WDPDS APGEI E YE VQ 
RAPHNGFLSLVGGGLG PVTRFTQADVDSGRLAFVANGSS VAGI F 
QLS MS DG AS P P L PMSLAVD I LPS A I E VQLRAPL E VPQ ALGR SSL 
SQQQLRWSDREEPEAAYRLIQGPQYGHLLVGGRPTSAFSQFQI 
IXXSEWFAFTNFSSSHDHFRVIAl^GVmSAVVNVTVRALLHV 
WAGGP WPQGATLRLDPTVLDAGELANRTGS VPR FRLLEG PRHGR 
WRVPRARTEPGGSQLVEQFTQQDLEDGRLGLEVGRPEGRAPGP 
AGDS LTLELW AQG VP PAVAS LDF ATE P YNAARP YS VALL S V P EA 
ARTEAGKPESSTPTGEPGPMASSPEPAVAKGGFLSFLEANMFSV 
I I PMCLVLLLLAL I LPLLFYLRKRNKTGKHDVQVLTAKPRNGLA 
GDTETFRKVEPGQAIPLTAVPGQGPPPGGQPDPELLQFCRTPNP 
ALKNGQYWV 


5451 


1 


2274 


RDS S EQGRTGDTLG RP SACM DALKPP CLWRN HE RG KKDRDS CGR 
KNSEPGSPHSLEALRDAAPSQGLNPLLLFTKMLFIFNFLFSPLP 
TPALI CI LT FGAAI FLWL I TRPQP VLPLLDLNNQS VG IEGGAR K 
GVSQKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCLGYRKPNQ 
P YRWLS YKQVSDRAE YLGSCLLHKG YKS S PDQFVG I FAQNRPE W 
IISELACYTYSMVAVPLYDTLGPEAI VHIVNKADIAMVICDTPQ 
KALVLIGNVEKGFTPSLKVIILMDPFDDDLKQRGEKSGIEILSL 
YDAENLGKEHFRKPVPPSP ED LS VI CFTSGTTGDP KG AM I THQN 
I VSNAAAF LKCVEHAYEPT P DDVAI S YL PLAHM FE RI VQAWYS 
CGARVGFFQGDIRLLADDMKTLKPTLFPAVPRLLNRIYDKVQNE 
AKT PLKK FLLKLAVSS KFKELQKG 1 1 RHDS FWDKL I FAKI QDS h 
GGR VRV I VTG AAPM STS VMT FFRAAMG CQVYEAYGQTECTGG CT 
FTLPGDWTSGHVGVPLACNYVKLEDVAIJMNYFTVNNEGE VCI KG 
TWFKGYLKDPEKTQBAI.DSDGWLHTGDIGRWLPNGTLKIIDRK 
KNIt'lUjAQGEYIAPEKIENIYNRSQPVliQIFVHGESLRSSLVGV 
WPDTDVLPSFAAKLG VKGS FEELCQNQ WREAI LEDLQKIGKE 
SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452 


1833 


1138 


SRVPSLCLSLSLSLSPSRBPVAGAPGCGTAGPPAMATLWGGLLR 
LGSLLSLSCLALSVLLLAQLSDAAKNFEDVRCKC I CPP YKENSG 
HIYNKNISQKDCDCLHWEPMPVRGPDVEAYCLRCECKYEERSS 
VTIKVTI 1 1 YX»SILGLLLI*YMVYLTLVEPILKRRLFGHAQLIQS 
DDDIGDHQPFANAHDVLARSRS RANVLNKVE YAQQRW KLQVQEQ 
RKSVFDRHWLS 


5453 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAV 
AG PAPS TVPSSTS KDR P VS Q P S LVGS KEE PP P AR S G SGGGSAKE 
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVL 
RS WCRQ I L KG LQ FLHTRTPP I IHRDLKCDNI F r TG PTGS VKIGD 
IiGIiATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII 
EG CIRQNKDER YS I KDL LNHA FFQEETG VR VELABEDDG EK I AI 
KLWLRI EDI KKLKGKYKDNEAI EFS FDLERNV PEDVAQEMVESG 
YVCEGDHKTMAKAIKDRVSLI KRKREQRQL* 


5454 


111 


1520 


PSIPAAVPQSAPPEPHRBETVTATATSQVAQQPPAAAAPGEQAV 
AG PAPS TVPS STS KDRPVSQPS LVGS KEE PPPARSGSGGGSAKE 
PQEERSQQQDDIEBLETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KG LDTETT VE VAWCELQDRKLTKSERQRFKEEAEMLKGLQH PN I 
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SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








VRF YDS WESTVKGKKC I VLVTELMTSGTLKT YLKR FKVMKI KVL 
RSWCRQILKGLQFLHTRTPPI IHRDLKCDNI FITGPTGSVKIGD 
LGLATLKRAS FAKS VIGTPEFMAPEMYEE KYDES VDVYAFGMCM 
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII 
EGCIRQNKDER YS I KDLLNHAFFQEETG VR VELAEEDDGEKIA I 
iNuwuRitui mujMjiv j csAJc*EJ\±t,r irDL»i£RNVPEDVAQEMVESG 
YVCEGDHKTMAKAIKDR VSLIKRKREQRQL * 


5455 


1359 


377 


LTMVS PATRKS LPKVKAMDFITSTAI LPLLFGCLGVFGL.FRLLQ 
WVRG KA YLRN AVW I TGATSGLGKECAKVFYAAG AXL V LCG RNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
• QC FG YVD I LVNNAG I S Y RGT I MDTTVDVD KRVME TNY FG P VALT 
KALLPSMIiCRROGHIVAlSSIC?GKMSIPFRSAYA.ASKHATQAF , F 
DCLRAEMEQYE I EVTV I S PGYIHTNLS VNAITADGSR YGVMDTT 
TAQGRS P VE VAQD VLAAVGKKKKD VI IADLL PS LAVYLRTLAPG 
LFFS LMASRARKERKS KNS 


5456 


2 


2332 


CGAGLVAAGAVLVLY PAS RAGERTRVP3S P APSS LPLHS PGACG " 
TEVDMDPQRSPLLEVKGNIELKRPLIXAPSQLPLSGSRLKRRPD 
QM EDGLE PEKKRTRGLG ATT KI TTSH PR VPSIiTT VPQTQGQTT A 
QKVS KKTGPR CS TA IATGIjXNQKP VPAVP VQKSG TSGVP PMAGG 

kkpskrpawdlkgqlcdlnablkrcrertqtldqenqqlqdqlr 
daoxwvkau;terttleghi»akvqaqaeqggqelknlracvlel 

EBRLSTQEGLVQELQKKQVEU3EERRGLMSQLEEKERRLQTSEA 
ALS S S QAEVAS LRQETVAQAALLT EREERLHGLEMERR RLHNQL 
QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEBIA 
MIiVQSALDGYPVCIFAYGQTGSGKTFTMSGGPGGDPQLEGLI PR 
ALRHLFS VAQELSGQGWTYS FVASY VE I YNBTVRDLLATGTRKG 
0GGECEIRRAGPGSEELTVT^ARYVPVSCEKEVDAl4LHl»ARQ>m 
AVARTAQNERS S R SHS VFQLQ I SG EHSS RG LQ CG APLS LVDLAG 
SERLDPGLALGPGERERLRETQAINSSLSTLGLVIMALSNKESH 
VPYRNSKLTYLLQNSLGGSAKMLMF VNI SPLEENVSESLNS LRF 
ASKVEPSVLTOTAQSNRKVJKTDPDLCVCVCVCVCVCVCVCVCVP 
MSM YRVRGGR VAGGCF IG WRAPCP RA Z K 


5457 


2 


1540 


DDFVERRRWTRTTCLVRSPPHVPVCGHACSWNGGSLDPLKGTPA 
LLRSAERLMRKVKKLRLDKENTGSWRSFSLNSEGAERMATTGTP 
TADRGDAAATDDPAARFQVQKHSWDGLRS I IHGSRXYSGLI VNK 
APHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLIjYSEIPKKV 
RKEALLLLSWKQMIiDHFQATPHHGVYSREEELLRERKRjjGVFGI 
TSYDFHSESGLFLFQASNSIjFHCRDGGKNGFMVSPGPGCVSPMK 
PLEIKTQCSGPRMDPKICPADPAFFSFINNSDLWVANIETGEER 
RLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGLKTLRI LYEEVDES E VE V I HVPS PALEERKTDS YRY PRT 
GSKKPKIALKLAEFQTDSQGKIVSTQEKELVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAM FLDR PQQ WLQLVLLP PALF I PSTENEEQA 
ASLCQS CPQECPAVCG VRGGKQRLDQCS 


5458 


6642 


4022 


FVPGLRE PQWE PAQPS ATMS APS EEEE YARLVMEAQP EWLRAE V 
KRLSHELAETTREKIQAAEYGIiAVIiEBKHQLKLQFEELEVDYEA 
I RSEMEQLKEAFGQAHTNHKKVAADG ES REES LI QES AS KEQYY 
VRKVLELQTELKQLRNVLTNTQSENERLASVAQELKEINQNVEI 
QRGRLRDD I KEYKFREARLLQDYSELEEENIS LQKQVS VLRQNQ 
VEFEGLKHEIKRLEEETEYLNSQLEDAIRLKElSERQIiEEAIiET 
LKTEREQKNSLRKELSHYMSINDSFYTSHLHVSLDGIjKFSDDAA 
EPNNDABALVNG FEHGGLAKLP LDNKTSTPKKEGLAP PS PS LVS 
DL£*SELNISEIQKLKQQtiMQMEREKAGIiLATIjQDTQKQLEHTRG 
SLSEQQEKVTRLTENLSALRRLQASKERQTALDNEKDRDSHEDG 
DYYEVDINGPEILACKYHVAVAEAGELREQLKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEKVSLLEKASRQDREI.LARLEKELKKVS 
DVAGETQGSLSVAQDELVTFSEEliANLYHHVCMCNNETPNRVML 
D YYREGQGGAGRTS PG GRTS PEARGRRSP ILLPKGLLAPEAGRA 
DGGTGDSSPSPGSSLPSPLSDPRREPKNIYNL1AIIRDQIKHLQ 
AAVDRTTELSRQRIASQELGPAVDKDKEALtlEEILKLKSIiLSTK 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








REQITTLRTVXjKANKQTAETVALAfJLKSKYENEKAMVTETMMKljR 
NE LKALKJSDAATFSS LRAMFATRCDEY I TQLDEMQRQLAAAEDE 

PATPSVSHTCACASDRA EGTGLANQVFCS JSKHS I YCD 


5459 


316 


1262 


RCJGHRLSGMASNFNDIVKQGYVRIRSRRLGIYQRCWLVFKKASS 
KGPKRI^KFSDERAAYFRCYHKVTEIjNNVraiVARLPKSTKKHAI 
G I YFNDDTS KT FACES DLEADEWCKVLQMECVGTRINDI S LGE P 
DLLATGVEREQSERFNVYLMPSPNLGCYMGECALQITYEYICLW 
DVQNPRVKLISWPLSALRRYGRDTTWFTFEAGRMCETGEGLFIF 
QTRDGEA1YQKVHSAALAIAEQHERLLQSVKNSMLQMKMSERAA 

SLSTMVPLPRSAYWQHITRQHSTGQLYRLQDVSSPLKLHRTETF 
PAYRSEH 


5460 


45 


209? 


RPGCRAGELSTGSRARERVRNRVSAPCGQDSRRCDPEVLRGRSP 
GLGLAEMPSCGACTCGAAAVRLITSSLASAQRGISGGRIHMSVL 
GR LGT FETQ I LQRA PLRS PTET P A Y FAS KDG I S KDG S G DGNKKS 
ASEGSSKKSGSGNSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CEKCHHFFVVLSEADS KKSI I KEPESAAEAVKIiAFQQKPPPPPK 
KI YNYLDKYWGOS FAKKVLS VAVYNHYKR I YNNIPANLRQQAE 
VEKQTSLTPRELEI RRREDEYRFT KLLQ IAGIS PHGNALGASMQ 
0Q VNQQI PQE KRGGE VLDSSHDD I KLEKSN I LLLGPTGS G KTLL 
AQTLAKCLDVP FAI CDCT TLTQAG YVGEDI ESVI AKLLQDANYN 
VE KAQQG I VF LD E VD K I GS VPG I HQLRDVGGEG VQQG LL KLLEG 
TI VNV PE KNS RKLRG ETVQ VDTTN I L FVAS G AFNGLDR 1 1 S RR K 
NEKYLGFOTPSNLGKGRRAAAAADLANRSGESNTHQDIEEKDRIj 
LRHVE ARDLI EFGMI PE FVGRLPWVPLHSLDEKTLVQI LTEPR 
NAVIPQYQALFSMDKCELNVTEDAliKAIARIALERKTGARGLRS 
IMBKLLLEPMFEVPNSDIVCVEVDKEWEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461 


1481 


i fin 


INFPPPPKSyuWKAKKWKKRKRPGAPEAAVMELPSGPGPERLFD 
SHRLPGDCFliLLVLLLYAPVGFCLliVLRLFLG I HVFLVS CALPD 
SVLRR FWRTMCAVLGLVARQEDSGLRDHS VRVLI SNHVTPFDH 
NIVNLLTTCSTPLLNSPPSFVCWSRGFMEMNGRGELVESLKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPLTL 
QVQRPLVS VTVSDASWVS ELLWSL F VP FTVYQVRWLRPVHRQLiG 
EANBE FALRVQQLVAKEIiGQTGTRLTPADKAEHMKRQRH PRLRP 
QS AQS S FPPS PCfPS PD VQLATLAQR VKE VL PHV PLG VI QRDLAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASKFPSSGPV 
TPQPTALTFAKSSWARQESLQERKQALYEYARRRFTERRAQEAD 


5462 


663 


3353 


KIKERQMSANKSPPSAQKSVLPTAIPAVLPAASPCSSPKTGLSA" 
RLSNGS FSAPSLTNSRGS VHTVS FLLQIGLTRES VTI EAQELS L 
S AVKDL VC S JVYQKFPECGFFGMYDKI LLFRHDMNS ENI LOL I T 
SADE I HEGDLVE WLSALAT VEDFQ I RPHTLYVHSYKAPTFCDY 
CGEMLWGLVRQGhKCEGCGLNYHKRCA FKlPNNCSGVRKRRLShl 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQEPSKRIPSWSGRPI 
WMEKMVMCRVKVPHTFAVHSYTRPTICQYCKRLIiKGLFRQGMQC 
KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 
NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDLDVERDEEAVKTI 
SPSTSNNIPI^WQSIKHTKRXSSTMVKEGWMVHYTSRDNLRK 

RHYWRLDSKCLTLFQNESGSKYYKEIPLSEILRISSPRDFTNIS 

OGSNPHC*FPTTTr)TM\/VC\7r:ti*KnvTr'ncciJ\mirr AitTmrrr ntr«<sn 
uugivrn^r r>± x li/inv x r vvjtiviNJtjlJooJTNt'VijA/l J l^VGLDVAQS 

WEKAIRQALMPVTPQAS VCTS PGQGKDHKDLSTS IS VSNCQ I QE 

NVDISTVYQIFADEVLGSGQFGIVYGGKHRKTCRDVAIKVIDKM 

RFPTKQESQLRNEVAILQNLHHPGIVNLECMFETPERVFWMEK 

LHGDML EM I LS SE KS RLPER I T KFMVTQ I LVALRNLH FKN 3TVHC 

DLKPENVLIiASAEPFPQVKLCDFGFARI IGEKSFRRSWGTPAY 

LAPEVLRSKGYNRSIiDMWSVGVIIYVSLSGTFPFNEDEDINDQI 

QNAAFMYPPNPMREISGEAIDLINNLLQVKMRKRYSVDKSLSHP 

WLQDYQTWLDLREFETRIGERYITHESDDARWEIHAYTHNLVYP 

KHFIMAPNPDDMEED P 


5463 


237 


1012 


LLSVTMTTSRCSHLPEVIiPDCfSSAAPWKTVEDCGSLVNGQPQ 
YVMQVSAKDGQLLSTWRTLATQSPFNDRPMCRICHEGSSQEDL 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LSPCECTGTLCTIHRSCLEHWLSSSNTSYCELCHPRFAVERKPll 
PLVEl^LRNPGPQHEKRTLFGDMVCFLFITPLATISGWI,CLRGAV 
DHLHFS SRLEAVGL I ALTVALFTI YLFWTLVS FRYHCRLYNE WR 
RTKQRVILLI PKSVNVPSNQPSLLGLHS VKRNSKETW 


5464


195 


677 


SPSMNPRKKVDLKLIIVGAIGVGKTSLLHQYVHKTFYBEYQTTIj 
GASrLSKIIIUSDTTLKLQIWDTGGQERVRSMVSTFYKGSDGCI 
LAFDVTDLES FEAIiDI WRGDVLAKI VPMEQS Y PMVLLGNKIDLA 
DRKYQSILENHLTESIKLSPDQSRSRCC 


5465 


5278 


3348 


KGDPREFIRVHREALECDYVSAHLHEWIDLIFGYKQQGPAAVEA 
VNVFHHLFYEGQVDIYNINDPLKETAT1GFINNFGQIPKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 
KELKEPVGQIVCTDKGlbAVEQNKVI/IPPTWNKTFAWGYADLSC 
RLGTYESDKAMTVYECLSEWGQILCAICPNPKLVITGGTSTWC 
v r» drio i o ivc i\z\rv a v i i»wj>\Jj1ajH I u I v I CA 1 AS LiA YH 1 1 VSGSR 
DRTCIIWDUJKrjSFLTQLRGHRAPVSALCIJTELTGDIVSCAGTY 
IH VWS INGNP I VS VNTFTGRS QQ 1 1 CCCMS EMNEWDTQNVI VTG 
HSDGWRFWRMEFLQVPETPAPEPAEVLEMQEDCPEAQIGQEAQ 
DEDSSDSEADEQSISQDPKDTPSQPSSTSHRPRAASCRATAAWC 
TDSGSDDSRRWSDQLSLDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNP I E VRNYS RLKPGYR WERQLVFRSKLTMHTAFDRKDNAHF A 
EVTAIiGISKDHSRILVGDSRGRVFSWSVSDQPGRSAADHWVKDE 

SSPVRVCQNCYYNLQHERGSEDGPRNC 


5466 


3 


992 


HACAHASAHASGRl, VRWWRKRRS VMG IQTS P VLLASUJ VGL VTL 
LGLAVGSYLVRRSRRPQVTLLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVG KH I YXSTR I DGSLYI RP YTPVTS DEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLJCVGDWEFRGPSGL 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AILKVPEDPTQCFLLFANQTEKDIILREDLEELQARYPNRFKLW 
FTLDHPPKDWAYSKGFVTADMIREHLPAPGDDVLVLLCGPPPMV 
QLACHPNLDKLGYSQKMRFTY 


5467 


2103 


4 


GEALRVGT RGC RRDL PD PQAR I F I Q KKDLEEDES VTAAH LKSRG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEMECLQPDQRTL 
YRDVMLENYSHlilSLAGSSISKPDVITLLEOBKEPWMWRKETS 
RRYPDLELKYGPEKVS PENDTSEVNLPKQVI KQ I STTLGI EAFY 
FRNDS E YRQFEGLQGYQEGNINQKM I S YE KL PTHT PHAS L I CNT 
HKP YECKECGK YFSCGSNL I QHQS I HTGEKPYK CKECGKAFQLH 
IQLTRHOKFHTGEKTFECKFrGKAPOT.DTY^T.MDHVMTirrir'irirT o 

ECKECGKSFNRSSKLTQHQSIHAGVKPYQCKECGKAFNRGSNLI 
QHQ KIHSNE KPF VCKECGMAFR YHYQL I EHCQ I HTGEKP FE CKE 
CGKAFTLLT KL VRHQ KI HTGE KP FE CRECGKAFS IjIjNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FKRGAHLIQHQKI HSNEKPFVCRECEMAFRYHCQiil EHSR IHTG 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNLNQHRSIHTGKKPF 
ECKECGKAFRLHMHLIRHQKLHTGEKPFECKECGKAFRLHMQLI 
RHQKLHTGEKPFECKECX3ICVFSLPTQLNRHKNIHTGEPCAS 


5468 


225 


2976 


S FLTDLFQSLAQLENIiCKQLYETTDTTTRLQAEKALVEFTNS PD

CLSKCQLLLERGSSSYSQLLAATCLTKLVSRTNNPLPLEQRIDI

RNYVLNYIATTRPIOaATFVTQALIQLYARITKl^WFDCQKDDYVF 

RNATTDVTRFLQDS VEYC I IGVTI LSQLTNEINQVSATAFL I EA 

DTTHPLTKHRKIASSFRDSSLFDIFTLSCNLLKQASGKNLNLND 

ESQHGLLMQLLKLTHNCLNFDFIGTS TDESSDDLCTVQ IPTS WR 

SAFLDSSTLQLSTIGRCEYEKTCALLVQLFDQSAQSYQELLQSA 

SASPMDIAVQEGRLTWIiVYIIGAVIGGRVSFASTDEQDAMDGBL 

VCRVLQLMNLTDSRLAQAGNEKLELAMLSFFEOPRKIYIGDGVQ 

KSSKLYRRIiSEVLGLNDETMVLSVFIGKIITNLKYWGRCEPITS 

KTLQLLNDLS IGYs S VRKLVKLSAVQFMLNNHTSEHFS FLGINN 

QSNLTDMRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPLTAAFE 

&VAQMFSTNS FNEQ EAKRTL VG LVRDLRG I AFAFKAKTS FMMLF 

EWIYPSYMPII^RAIEI,WYHDPACTTPVLKU1AELVHNRSQRLQ 
FDVSSPNGlLLFRETSKMITMYGNRlLTLGEVPtaXJVYALKL'KG 
I S I C FSML KAALSG S Y VNFG VFRLYG DDALDNALQTFI KL LLS I 
PHS DLLDY P KLSQS YYS LLEVIiTQDHMN F IASLE PHV I M Y I LS S 
ISEGLTAIJ>TMVCTGCCSCIiDHIVTYLFKQLSRSTKICRTTPI^Q 
ESDRFLHIMQQHPEMIQQMLSTVLNI 1 1 FEDCRNQWSMSRPLLG 
LILLNEKYFSDLRNSIVNSOPPEKQQAMHLCFENLMEGIERNLL 
TKNKDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2653 


DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ
EPTCVSDYMSISTCEWKMNGPTNCSTELRLLYQLVFLLSEAHTC 
VPENNGG AG CVCHLLMDDWS ADNYTLDLWAGQQLLWKG S F KPS 
EHVKPRAPGNLTVHTNVSDTLLLWSNPYPPDNYLYNHLTYAVN 
IWSENDPADFRIYWTYLEPSLR1AASTLKSGISYRARVRAWAQ 
CYN7TWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL 
CYVS I TKIKfGSWWDQI PNPARSRLVAI I IQDAQG SQWEKRS RGQ 
E PAKC PHWKNCLTKLLPCFLEHNMKRDEDPHXAAKEMP FQG SGK 
SAWCPVEISK7VLWPESISWRCVEL7EAPVECEEEEEVEEEKG 
SFCASPESSRDDFQEGREGIVARLTESIjFLDLLGEENGGFCQQD 
nvj c & ujjIjF vm* 5 TS AHM P WDE F PS AG P KEAP PWG KEQ P LH L E PS 

PPASPTQSPDNLTCTETPLVIAGNPAYRSFSNSLSQSPCPRELG 
PDPLLARHLEEVEPEMPCVPQLSEPTTVPQPEPETWEQILRRMV 
LQHGAAAAP VSA PTSGYQE FVHAVEQGGTQASA WGLGPPGEAG 
YKAFSSLLASSAVSPEKCGFGASSGEEGYKPFQD^IPGCPGDPA 
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPLPQEQATDPLVDSLGSGXVYSALTCHLCGHLKQCHaQEDGG 
OTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA 

SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470 


17 


1418 


TACKIKTSLNRGIAAVKKnAVEMLASYGliAYSLMKFFTtepMSDF 
MMVtjjjVi? VNbiU<DRTKAVLCT4VvAGAIAAVFHTLIAYSDliGYYI 
INKLHHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
S FL VGCAS ISD VI AQ WFVA I LLHSHLE CR EPLL I P I LSL YMG *V 
LVRCTTLCLGYYKNIHDIIPDRSGPELGGDATIRKMLSFWWPLA 
L I LATQRI SRP I VNLFVS RDLGG SS AAT EAVA I LTAT YP VGHM P 
YGWLTE I RAVY PAFDKNN PSNKLVSTSNTVTAAH I KKFTFVQ4A 
LSLTLCFVMFWTPNVSEKILIDIIGVDFAFAELCWPLRIFSFF 
PVPVTVRAHLTGWU4TLKKTFVLAPSSVLRIIVLIASLWLPYL 
GVHGATLGVGSLLAGFVGESTMDAIAACYVYRKQKKKMENESAT 
EGEDSAMTDMPPTEEVTDIVEMREENE 


5471




658 


Acacmxr r\j r\4t\t\t\/\t\i /WuuvUj V tTCAAAAAyGGGGGE PRRTEGV 
GPGVPGEVEMVKGQPFDVGPRYTQLQYIGEGAYGMVSSAYDHVR 
KTRVAIKKlSPFEHQTYCQRTLREIQILLRFRHENVIGIRDIliR 
ASTLEAMRDVY I VQDLMETDLYKLLKSQQLSNDHIC YFLYQI LR 
GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGLARIADPEHD 
HTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLS 
NR PI FPG KH YLDOLNHI LG I P <Jfi R n T .Mr t t mmv a n mvt r\ o t 

PS KTKVAWAKLFPKSDSKALDLLDRMLTPKPNKR ITVEE ALAH P 

YLEQY YDPTDEP VAEE PFTFAMELDDLPKERLKEL I FQETARFO 
PGVLEAP 


5472 


1469 


753 


L YVMARYIiSDEE VA VS IDRLCKANGRS PS IPFGTVRI PGRARVR 
DPQALW I FGYGS LVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 
GSDKMPGRWTLLEDHEGCTWGVAYQVQGEQVSKALKYLNVREA 
VLGGYDTKEVTFYPQDAPDQPLKAXiAYVATPQNPGYLGPAPEEA 
IATQILACRGFSGHNLEYLIjRVRDVMQLCGPQAQDEHLAAIVDA 
VGTMLPCPCPTEQALALV 


5473 


3 


2119 


FMNVKLLIQDU5DIEQRVPVMDAQYKIITKTAHLITKESPQEEG

XEMFATMS KLKEQLTKVKEC YS PLLYESQQLL I PLBELEKQMTS 

FYDSLGKINE1ITVLEREAQSSALFKQKHQELLACQENCKKTLT 

LIEKGSQSVQKFVTLSN^KHFDQTRLQRQIADIHVAFQSMVKK 

TGDWKKPn^TNSRLMKKFEESRAELEKVLRIAQEGLEEKGDPEE 

LLRRHTE F FS Q LDQR VLNAFLKACDELTD I LPEQEQQGLQEAVR 

KIjHKQWKDLQGEAPYHLLHLKIDVEKNRFIjASAEECRTELDRET 
KLMPQEGSEKIIKEHHVKFSDKGPHHLCEKRLQLIEBLCVKLPV 
RDPVRDTPGTCHVTLXELRAAIDSTYRKLMEDPDKWKDYTSRFS 
EFSSWISTNETQLKGIKGEAIDTANHGEVKRAVEEIRNGVTKRG 
ETLSWLKSRLKVLTEVSSENEAQKQGDELAKLSSSFKALVTLLS 
EVE KMLSNFGDCVQY KE I VKNSLE ELI SGS KEVQEQAEK1 LDTE 
NLFEAQQLLLHHQQKTKRISAKKRDVQQQIAQAQQGEGGLPDRG 
HB EI.R KLESTLDGLERS RERQERRIQVTLRKWERFETNKETWR 
YLFQTGSSHBRFLSFSSLESLSSELEQTKEFSKRTESIAVQAEN 

LVKEASEIPLGPQNKQLLQQQAKSIKEQVKKLEDTLEEEYVIDK 
S 


5474 


2 


780 


TPDVRQIiQASRRGIAVASWCSPRWFAGEEMAPVK-q-SMr.T poctt 
LKRWKKNWFDLWS DGHL I YYDDQTRQN I EDKVH M PMDC IN I RTG 
QECRDTQPPDGKS KDCMLQ I VCRDGKTI SLCAESTDDCLAWKFT 
LQDSRTNTAY VGSAVMTDETS WS S PP P YTAYAAPAPE VGRTLS 
LQQAYG YGP YGGA YP PG TQ WYAANGQAYA VP YQ Y PYAGL YGQQ 
PANQVI IRERYRDNDSDLALGMIiAGAATGMALGSLFWVF 


5475 


2 


506 


ARGWLESLSiTCOTTPPPSSPCLLHSPETFIHTMPPNLTGYYRF 
VSQKNMEDYLQALNISLAVRKIALLIiKPDKEIEHQGNHMTVRTL 
STFRNYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVFRKVR 


5476


192 


1457 


^ ° •«»»-m-»*"'v~x'v».i oKA^ v *ioijftFc»l\yi>t 1 bJLHQYLiVDEPTLSWSR 
PSTRASEVLCSTNVSHYELQVEIGRGPDNLTSVHliARHTPTGTL 
VTIKITNLENCNEERLKALQKAVILSHFFRHPNITTYWTVFTVG 
SWLWVISPFMAYGSASQLLRTYFPEGMSETLIRNIZiFGAVRGLN 
YLHQNGC IHRS I KASHI L I SGDGL VTLS GLS HLHSL VKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 
ASGQVPFQDMHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 

SGVUSGIGESVLVSSGTHTVNSDRLHTPSSKTFSPAFFSLVQLC 
LQQDPEKRPSASSLLSHVFFKOMK'RRQoncTr ct t tmAwrvno-r 

SLPPVLPWTEPECDFPDEKDSYWEF 


5477 


3 


1044 


RGNSRLRYSHEDELQLPRbPELFETGRQIiLDEVEVATEPAGSRI 
VQEKVFKG LDLLE KAAEMLS QLDL FSRNEDLEEI AS TDLK YLL V 
PAFQGALTMKQVNPSKRLDHLQRAREHFINYLTQCHCYHVAEF2 
LPKWNNSAENHTANSSMAYPSLVAMASQRQAKIQRYKQKKELE 
HRLSAMKSAVESGQADDERVREYYLLHLQRWIDISLEEIESIDQ 
E I KI LRERTJ SSRE AS TS NS SROERP PVKPFT LTPNM AO A mrvn a 

GYPSLPTMTVSDWYEQHRKYGALPDQGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVRI WVPNVKGESTVFRAHTATVRSVHFCSDGQS FVTASDDKT 

VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK 

LWDKSSRECVHSYCEHGGFVTiVDFHPSGTCIAAAGMDt^TVKVW 

DVRTHRLLQHYCLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 

MEGRLLYTLHGHC^3PATTVAFSRTGEYFASGGSDEQVMVWKSNF 

DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC 
LENQQLIMQRATP 


5479


2 


835 


KTVRI WVPNVKGESTVFRAHTATVRSVK FCSDGQS FVTASDDKT 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKW; 
DVRTHRLLQH YQLKS AAVNGLS FHPSGNYL I TASSDSTLKILDL 
MEGRLLYTLHGHCX5PATTVAFSRTGEYFASGGSDEQVMVWKSNF 

DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC 
LENQQLIMQRATP 


5480


444 


1952 



LSLTSRMEEAELVKGRLQAITDKRKIQEBISQKRLKIEEDKLKH
QH LKKKALR EKWLLDG I S SGKEQ EEM KKQNQQ DQHQ I QVLEQS I 
LRLEKE IQDLE KAEhQ I S TKEEA I LKKLKS I ERTTED 1 1 R S VKV 
EREERAEES IEDI YANI PDLPKS YI PSRLRKE INEEKEDDEQNR 
KALYAMEIKVEKDLKTGESTVLSSI PLPSDDFKGTGI KVYDDGQ 
KSVYAVSSNHSAAYNGTDGtiAP VEVEELLRQASERNSKS PTEYH 
EPVYANP FYR PTTPQRET VTPGPNFQERI KI K1VGLG1G VNES I 
4NMGNGLSEERGNNFNHISPIPPVPHPRSVIQQAEEKLHTPQKR 
LMTPWEESNVMQDKDAPS PKPRLS PRKTI FGKSEHQNSSPTCQE 
DEED VR YNI VHS LPP DI NDT E P VTMI FMG YQQAEDS E E DKKFLT 

GYDGIIHAELWlDDEEEEDEGEAEKPSyHPlAPHSQVYQPAKP 
TPLPRKRSEASPHEKHK3 


5481


3 


1422 


NS?GSVCIjCOCVCPSLLHCI,PPIiLLLLt.LPLLLHESPQPPALRV

QGLNEAGDDLEAVAKFLDSTGSRLDYRRYADTLFDILVAGSMLA 
PGGTRI DDGDKTKMTNHCVFS ANEDHET IRN YAQ VFNKL I RR Y K 
YLEKAFEDEMKKLLLFLKAFSETEQTKIiAMLSOILLGNGTLPAT 
I IiTS L FTD S LVKEG I AAS FAVKL FKAWMAE KD ANS VTS S LRKAN 
LDKRLLELFPWRQSVDHFAKYFTpAGLKELSDFTiRVGXJSLGTR 
KELQKE LOERLSQECP I KEWLYVKEEMKRNDLPETAVIGLLWT 

CIMNAVEWNKKEELVAEQALKHLKQYAPLLAVFSSQGQSELILL 
OKVOEYCYDNIHFMKAP'OIfTWT.PVK'anvT cpprtt vMvim-mru 

AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482


1452 


528 


THVVMTGMCYAPHQVLSYINGVTTSKPGVSLVYSMPSRNLSLRI,
EGLQEKDSGPYSCSVNVQDKQGKSRGHSIKTbELNVLVPPAPPS 
CRLQGVPHVGANVTLSCQSPRSKPAVQYQWDRQLPSFQTFFAPA 
LDVIRGS LS hTNhS S SMAGVYVCKAHNEVGTAQCNVTLEVSTG P 
u/vttv vrtbAv vvj 1 JLjVUjj^JjIjAIsIjVijIjYHRRGKALEEPANDI KEDA 
I APRTLP W PKS S DT I S KNGTLS S VTS ARALR P PHG P PR PG ALT P 

TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


""§483 


1 


788 


FFFFKGCRAGRGNESDYRKLEEMHQRFLVSERSKDDLQLRLTRA 
ENRIKQLBTDSSEEISRYQEMIQKL<2NVLESERENCGLVSEC3RL 
KLQQENKQLRK3TESLRKIALEAQKKAKVKISTMEHEFS I KERG 
FEVQLREMEDSNRNSIVELRHLLATQQKAANRWKEETKKLTESA 
FIRINNLKSELSRQ KLHTQELLSQLEMANE KVAEME KL ILEHQ E 
KANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNIiENI 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS 
ESDQDERGDSGQPSNKEIiFGDDSEDEGASHHSGSDNHSERSDNR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSB 
AEGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDEERAQGSDEDK 
LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 
SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 
P I P ETR I EVE I P KVNTDLGNDL YF VKLPNFLS VE PRP FDPQ Y YE 
DE FED E EMLDEEGRTR LKJLXVENT IR WR I RRDE EGNEI KESNAR 
I VKWS DGSMS LHLGNE VFDVY KAPLQGDHNHLF I RQGTGLQGQ A 
VFKTKLTFRPHS TDSATHRKMTLS IiADRCSKTQ KI R I L PMAGRD 

PECQRTEMIKKEEERLRASIRRESQQRRMREKQHQRGLSASYLE 
PDRYDEEEEGEES ISLAAI KNRYKGGIRERRJVR TVQcncnrr»ct? 

EDKAQRLliKAKKLTSDEVRPNLFNSRGLSCTQEPTALNEELTDQ 
AGTN 


5485 


161 


1074 


KRKHiSSMMDSEAHEKRPPILTSSKQDISPHITNVGEMKHYLCG 
CCAAFNNVAITFPIQKVLFRQQLYGIKTRDAILQLRRDGFRNLY 
RGILPPLMQKTTTLALMFGLYEDLSCLLHKHVSAPEFATSGVAA 
VLAGTT EAI FTPLER VQT LLQDHKHHD KFTNT YQ AF KALKCHG I 
GEYYRGLVP1LFRNGLSNVLFFGLRGPIKEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFP INWKTRIQSQIGGEFQS FPKVFQKI 
WbERDRKLINLFRGAHLNYKRSLISWGIINATYEFLLKVI 


5486 


1404 


142 


IPGSTISWSPAAARGLSVCRCCRLHPASAMDLFGDLPEPEESPR 
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPI.LFDDLPPASSGDS 
GSLATSISQMVKTEGKGAKRKTSEEEKNGSEELVEKKVCKASSV 
IFGLKGYVAERKGEREBMQDAHVILNDITEECRPPSSLITRVSY 
FAVFDGHGGIRASKFAAQNLHQNLIRKFPKGDVISVEKTVKRCIj 
LDTFXHTDEEFLKQAS SQKPAWKDGSTATCVLAVDN I L YIANLG 
DSRAILCRYNEBSQKHAALSLSKEHNPTQYEERMRIQKAGGNVR 
PGR VLGVLE VSRS IG DGQY KRCG VTS V P D I RRCQLT PNDR F I LL 
ACDGLFK V FTPEEA VN F ILSCbE D EK IQTREGKS AADAR YEAAC 
NRLANKAVQRGSADNVTVMWRIGH 


5487 


535 


" 182 


AV < 51jSOTRr!i/VrPAPVDiiDr/>DPDCwr'nMCD\rrr mi? i «nr 

•rvv ou^inui^i rrtr v truirt^f ^IrolNLJjnCjKV 1 Li/vu 1 1 Lu\\y Li i A 

LEANDPFANKDDPFYYDWKNLQLSGLICGGLLAIAGIAAVLSGK 
CKCKSSQKQHSPVPEKA1PLITPGSATTC 


5488


1072 


259 


AHAASGE PQRQWQEEVAAWWGSCMTDLVS L.TS RLP KTGETIH 
GHKFFIGFGGKGANQCVOAARLGAMTSMVCKVGKDSFGNDYIEN 
LKQND I STE FTYQT KDAATGTAS 1 1 VNNEGQNI I VT VAG ANLLL 
NTEDLRAAANVISRAKVMVCQLEITPATSLEALTMARRSGVKTIi 
rWfAt'AlAUJjUPyrY 1 IjSXJ VF CCKESEAEILTGLTvGSAADAGE 
AALVLLKRGCQ WI ITLGASGCWLSQTEP EPKH I PTEKVKAVD 
TTVSFKI 


5489 


81 


893 


GKGPVAAFIDQSNIFLTDPXIFLGQWREBPKMPLLLLGBTEPtK 
L E R D CRS P VE PW AAAS P DLALACLCHCQDLS SG AFPNRG VLGG V 
t FPTVEMVI KVFVATSSGS I AIRKKQQE WGFLEANKI DFKELD 
IAGDEDNRRWMRENVPGEKKPQNGIPLPPOIFNEEQYCGDFDSF 
FSAKEENIIYSFIjGIiAPPPDSKGSEKASEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEET3EIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSNIFLTDPKXFbGQyjREEPKMPLLLLGE'TEPLK 
LERDCRSPVEPWAAASPDLAIACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKEIiD 
IAGDEDNRRWMRENVPGBKKPQNGIPLPPQIFNEEQYCXSDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NLPE AQE KNEEEGE TATEETEE I ANEGAEGEAE E EE ETAEGEE P 
GEDEDS 


5491 


204 


1194 


GSAPRLSLGPTGAQARDPDWWARPPSRPYTQSKEDRPDTEGRSE
QGDMASSFLPAGAIlt3DSGGELiSSGDDSGEVEFPHSPEIEETSC 
IiAELFEKAAAHLQGLIQVASREQLLYLYARYKQVKVGNCNTPKP 
SFFDFEGKQKWEAWfCAljGDSSPSQAMQEYIAWKKLDPGWNPQI 
P EKKG KE ANTG FGG PVIS SLYHEBTI RE EDKNI FDY CRENN I DH 
ITKAI KS KNVDVNVKDEEGRALIjHWACDRGHKEI>VTVLLQHRAD 
INCQDNEGQTALHYASACEFLDIVELLLQSGADPTLRDQDGCLP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


" 1896 


AS KNPLS AVCTTG I MSSEoAVRDPAMDRSLRSVFVGN I P YEATEEl 
QLKDI FSE VGSWS FRLVYDRETGKPKG YGFCEYQDQETALSAM 
RNUJJGREFSGRALRVDNAASEKNKEELKSLGPAAP 1 IDSPYGDP 
IDPEDAPE S I TRAVASLPPEQM FELMKQMKLCVQNSHQEARNML 
LQNPQIAYALDQAQ WMRIMDPE I ALKILHRKIHVTP LI PGKSQ 
S VS VSG PG PGPGPGLCpGPNVLLNQQN P P APQ PQKLARRP VKD I 
PPLMQTP IQGGI PAPGP I PAAV PGAGPGS LTPGGAMQPQLGMPG 
VG P V PLERGQ VQMS DPRAP I P RGP VTPGG L P PRGLLG D A PND P R 
GGTLLS VTGEVEPRGYIjGPPHQGP PMHHASGHDTRGPS SHEMRG 
G PLGDP RLL I GEPRG P M I DQRGL PMDGRGGRDSRAM ETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRG PVPSS RGP 
MTGGIQGPGP INIGAGGPPQGPRQVPGI SGVGNPGAGMQGTG IQ 
GTGMQGAGIQGGGKQGAGIQGVS IQGGG IQGGGIQGAS kqggsq 
PSSFSPGQSQVTPQDQEKAALIMQVLQLTADQIAMIiPPEQRQSI 
MLKEQIQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPEEPRKPGRLTQALNSPLTWEHVWICVPGGTPDCIi 
TDTFRVKRPHLRRSASNGHVPGTPVYREKEDMYDEI I ELKKSLH 
VQKSDVDLMRTKLRRLEEENSRKDRQIEQLLDPSRGTDFVRTLA 
EKRPDASWVINGLKQRILKLEQQCKEKDGTISKLQTDMKTTNLE 
EMRIAMETYYEEVHRZjQTLL^SETTGKKPLGEKKTGAKRQKKM 
GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 
SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACLASSSAL 
HRQPRGDRNKDHERLRGAVRDLFG3ER TALQ EQLLQRDLE VKQLL 
QAKADLEKELECAREGEEERREREEVLREEIQTLTSKIiQELQEM 
KKBEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSBEGLPRP 
RS PCSDGRRDAAARVLQAQWKVYKHKKKKAVLDEAAVVLQAA FR 
71 



5495 



273 



536 



2168 



GHLTRTKLIAS KAHGS E P PS VPGLPDQSS PVPR VPS P I AQATCS
PVQEEAI VI IQSALRAHLARARHSATGKRTTTAASTRRRSASAT 

HGDASSPPFLAALPDPSPSGPQAVAPLPGDDVNSDDSDDIV1AP 
SLPTKNFPV 



RSKAKIGTPTRBVPSTDMKVRRESSSSLTH RPAPSPATPRLLGT 
RR VLLG VS EGTGCADAME LVLVFLCS LLAPMVIASAAE KE KEKD 

PFHYDYQTLRIGGLVFAWLFSVGILtilLSRRCKCSPNQKPRAP 
GDEEAQVENLITANATJBPQKAEN 



DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG 
ELRPASLWLPRSLAPAFERFCQVNTGPLPLLGQSEPEKWMLPP 
OGAISETRMGHPQFWKYEFGACTGSIiASLEQySEQLKDMVAFFL 
GCS FS LEEALE KAGL PRRDPAGH S QAGAYKTT VP CVTHAG FCCP 
LVVTMRPIPKDKLEGLVRACCSLGGEQGQPVHMGDPBLLGIKEL 
SKPAYGDAMVCPPGEVPVFWPSPLTSLGAVSSCETPLAFASIPG 
CTVMTDLKDAKAPPGCLTPER1PEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 
GFPTHFNHEPPEETDGPPGAVALVAFLQAJjEKEVAIIVDQRAWN 
LHQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAFLCKNGDPQT 
PRFDHLVAIERAGRAADGNYYNARKMNIKHbVDPIDDLFLAAKK 
I PG ISSTGVGDGGNELGMGKVKEAVRRHIRHGDVI ACDVEADFA 
VIAGVSNWGGYALACALYILYSCAVHSQYLRKAVGPSRAPGDQA 
WTQALPSVIKBEKMLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEMIQKIiVDVTTAQV 



2408 



yUTKMHEIYKGNITPQLNKNTLKTSAATDVWAVYFSQFWIDYSG 
MKSGKGRPISPVDSFPLSIWICQPTRYAESQKEPQTCNQVSLNT 
SQSBSSDLAGRLKRKKLLKEYYSTESEPLTNGGQKPSS5DTFFR 
FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLIiFIiHESLILLSE 
NLR KDVEAVTGS PASQTS I C I GI LLRSAELALLLH P VDQANTLK 
SPVSESVSPWPDYIiPTENGDFLSSKRKQISRDINRIRSVTVNH 
MSDKRSMS VDLS H I PLKDPLL FKSASDTNLQKGI S FMDYLS DKH 
LGK I S ED ES SG LVY KSGSGE I GSETS DKKDSF YTDS SS VLN YR 3 
DSN Iks FDS DGNQNI LS S TLTS KGNETI ES I FKAE D LL PEAASI* 
S ENLD I S KE ET P P VRTLKSQSS LSG KP KERCP PNLA PLCVS YKN 
MKRSSSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKGNKKNS 
TTNYRGTAESVNAGANLQNYGETS PDAI 5TNSEGAQENHDDLMS 
WVFKITGVNGEIDIRGBDTEICLQVNQVTPDQLGNISLRHYLC 
NRPVGSDQKAVI HS KSS PE I SLRFESGPGAVIHS LLAEKNG FLQ 
CHIKNFSTEFLTSSLMNIQHJFLEDETVATVMPMKIQVSNTKINL 
KDDSPRSSTVSLEPAPVTVHIDHLWERSDDGSFHIRDSHMLNT 
GNDLKENVRSDSVLLTSGKYDliKKQRSVTQArQTSPGVPWPSQS 
ANFPEFSFDFTREQLMEENESLKQELAKAKMALAEAHLEKDAIiL 



5497 



1821 



HHIKKMTVE 



3308 



ISKLLKRRSNIDAYLLSNSCAFFAPRLFSLASQIIREQQSPNV 
CFrYKYSGFPSLECQCHFVSPHSSCYIWFFSFPPPFFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQIPSW105WAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
LALSRGLQLDTQRSSRDStiQCSSGYSTQTTTPCCSEDTIPSOVS 
DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRMFQAKRPA 
SrAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPlPIK 
T P VI PVKTPTVPDL PGVLPAP PDGPEERG EHS PES PSVGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPBEHRQAIPBSEAEDQER 
EPPSATVSPGQrPESDPADLSPRDTPQGEDMLNAIRRGVKLKKT 
TTNDRSAPRFS 



-243r 



T49F 



1 iiTHQE I FTQE KPCE CGKA5 I QMS H LS QQKI YSGENP FACKVCG " 

KVFSHKSNLTEHEHFHTREKPFECNECX3KAFSQKQYVIKHQNTH 

TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 

QKSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHEKIHIGEK 

PFKCSECGTAFGQKKYLIKHQNIHTGBKPYECMECX3KAFSQRTS 

LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 

NECGKAFSQFSTLALHIiRIHTGKKPYQCSECGKAFSQKSHHIRH 
QKIHTH


5499


324 


926 


GPGQIGRGHKITTYPFSPRKSGRKGMAQSQGWVKRYIKAFCKGF 
PVAVPVAVTFLDRVACVARVEGA9Mno<5T.NDr:r:crkCcn\rirT t htu 

WKVRNFEVHRGDI VSLVS PKNPEQKI I KRVIALEGDIVRTIGHK 
NR YVKVPRGH 1 WVEGDHHGH5 FDSNS FGPVS LGL LHAHATH I LW 
P PERWQKLESVLP PERLP VQREFJE 


5500 


1978 


1286


KPDWRLQNLPPRLYLWRSSRFGFGHLKKRLQMDFKIEHTWDGFP

VKHEPVFIRLNPGDRGVMMDISAPFFRDPPAPLGEPGKPFNELM 
DYEVVEAFFLNDITEOYTjFVPT.PPWsritrr.'irr t t cpDDxnmvnnT 

PLSFRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKRS 

YEAIiYPVPQHBLQQGQKPDFHCLEYFKSFNFNTLLGEEWKQPSS 
DLWLIEKCDI 


5501 


2927 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVEWDAAARPSSRPFSLP 
AAIMIALISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVXA 
SGQFSEDMIPTVGFNMRKVTKGNVTIKIWDIGGQPRFRSMWERY 

CRGVNAI WMTHAlVnT5PVTTraaDXTC«T UMT t nvnm — _,_ ... 

vnovimi vii >j.iJ*-ifxLJCs.cjX\.x ii/^rCI\iiJjrlx^ljLUK.FQIjQG I PVLiVIjG 

NKRDLPNALDEKQLIEKMNLSAIQDREICCYSISCKEKDNIDIT 
LQWLIQHSKSRRS 


5502 


3 


824 


NSAFPVWVPERTAJjLTCPLGAAPGSSREAPGIAGPPNSTAMSKL 
WMr *^°^ S3J ^ K ^^ F oVUh^UjVKliRETEEMIjGKKOEYLENRIQ 
REIALAKKHGTQNKRAALQALKRKKRFEKQIjTQIDGTLSTIEFQ 

realenshtwevlrnmgfaakamksvhenmdlnkiddlwqeit 
eqqdiaqeiseafsqrvgfgddfdedelmaeleeleqeelnkkm 
tkirlpnvpssslpaqpnrkpgmsstarrsraassgraeeeddd 

IKQLAAWAT 


5503 


216


654 

3563


KGVRRRGRVRSDSEDSHU3YFKMSFLLPKLTSKKEVDQAIKSTA 
E KVLVLR FGRDED PVCLQL DD I LS KTS S DLS KMAA I YL VD VDQT 

AVYTQYFDI S YI PSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVNS 


5504 


58 




QLSFSFQAPVTFDDITVYLLQBEWVLLSQQQKELCGSNKLVAPL 
GPrVANPELFRKFGRGPEPWLGSVOGQRSLLEHHPGKKOMGYMG 
EMEVQGPTRESGQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 
LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPStRDKRSRL 
I EGYTGP FKVETLKYHAKS KAHMFCVNALAARDPI WAARFRS I R 
DPPGDVLASPEPLFTADCPIFYPPGPLGGFDSMAELLPSSRAEL 
EDPGGIX3AIPAMYIjDCISDI/RQKEITDGIHSSSDINILYNDAVE 
SCI QDP SAEGLSE EVP WFEELP WFEDVAVYFTREEWGMLDKR 
QKEL YR DVMRMNY ELLAS LG PAAAKP DL I S KLERRAAP WI KD PM 
GPKWGKGRPPGNKK^AVREADTQASAADSALLPGSPVEARASC 
CSSSICEEGDGPRRIKRTYRPRSIQRSV/FGQFPWLVIDPKETKL 
FCSACIERPNLHDKSSRLVRGYTGPFKVETLKYHEVSKAHRLCV 
NTVEIKEDTPHTALVPEISSDLMANMEHFFNAAYSIAYHSRPLN 
DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILED 
VRNS PCVS VLLDSSTDAS EQACVG I YIRYFKQMEVKES YITLAP 
LYS ETADG YFET I VS ALDELDI P FRKPG WWGLGTDGS AMLS CR 
GGLVEKFQE VI PQLLPVHCVAHRLHLAWDACGS I DLVKXCDRH 
IRTVFKFY0SS2 , ?KRLNEIiOEGAAPl J EnFTTPr.jmT m&i/d&m/acc> 

RRTLHALLVSWPALARHLQRVAEAGGQIGHRAKGKLKLMRGFHF 
VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 
LRHQAGPKE EE FNAS FKDGRLHGI CLDKLE VAEQRFQAJDR ERTV 
LTG I E YLQQRFDADR P PQLKNME VFDTMAW PSGIELAS FGNDDI 
LNLARYFEC3LP1X3YSEEALLEEWLGLKTIAQHLPFSMLCKNAL 
AQHCRFPLLSKLMAVWCVPISTSCCERGFKAMNRIRTDERTKL 
SNEVLNMLMMTAVNG VAVTEYDPQ PAIQHWY LTSSGRR FS H VYT 
CAQ VPARS PAS ARLR KEEMGAL YVEE PRTQKPPILPS REAAEVL 
KDCIMEPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCSPRSLSAAKI^SNRNNNKLPSNLPQLQNLIKRDPPAYIEEFLQ
QYNHYKSNVEIFKLQPNKPSKELAELVMFMAQISHCYPEYLSNF 
PQEVKDLLSCNHTVLDPDLRMTFCKALILLRNKNLINPSSLLEL 
FPELFRa^DKLLRKTLYTHIVTDIKNINAKHKNNKV^^^Vl J QNFW 
YTMLRDSNATAAKMSLDVMIELYRRNIWNDAKTVNVITTACPSK
VTKILVAALTFFLGiCDEDSKQDSDSBSEDDGPTARDLLVQyATG 

E KLL KQL ECCKE RFE VKMMLMNLI S RLVG I HE LFL FN FTP FLQR 
FLQPHQRBVTKI LLFAAQASHHLVPPEI iQSIiLMTVANNFVTDK 
NSGEVOTVGINAIKEITARCPIAMTEEIJ^DlAQYKTHKDKNVhl 
MSARTLIHLFRTLNPQMLQKKFRGKPTEASIEARVQEYGELDAK 
DYIPGAEVLEVEKEENAENDEDGWESTSLSEEEDADGEWIDVQH 
SSDEEQQEISKKLNSMPMEERKAKAAAISTSRVLTQEDFQKIRM 
AQMRKEIiDAAPGKSQKRKYIEIDSDEEPRGELLSLRDIERLHKK 
PKSDKETRLATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506 


1 


1531 


yKGDLCGQRGGSAPGHGGSSAWPAPAkPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWSPLEGGGDEELRPNPYVRFPYRWWAWV 
LAAFP S W5AGGETPEAP PE S WTQL WFFRFWN AAGY AS FMVPG Y 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKASDEVPLAPRT 
EAAETTPMWQALKLLFCATGLQVSYLTWGVLQERVMTRSYGATA 
TS PGER FTD S Q FLVLMNR VLAL I VAG LSCVLCKQ PRHGAPM YR Y 
SFASLSNVLSSWCQYEAIiKFVSFPTQVLAKASKVIPVMLMGKLV 
SRRSYEHWEYLTATLISIGVSMFLbSSGPEPRSSPATTLSGLIL 
LAGYlAFDSFTSNWQDALFAyKMSSVOMMFGVWFFSCLFlVGSL 
LEQGALLEGTRFMGRHSE FAAHALLLS ICS ACGQLFI FYTIGQF 
GAAVFTIIMTLRQAFAILLSCLLYGHTVTWGGLGVAWFAALL 
LRVYARGRLKQRG KKAVPVESPVQKV 


5507 


3704 


1271 


PRGTRRCRP AGRASRRARRR PPC PG PAAPGS LE I GGFGT AAG KK

VA VAD VQ FGPMRFHQOQIiQ VLLVFTKEDNGCNG FCRACE KAG FK 

CTVTKEAQAVLACFLDKHHDI 1 1 1 DHRNPRQLDAE ALCRS I RSS 

Kr,SENTVIVGWRRVDREELSVMPFISAGFTRRYVENPNlMACY 

NELLQLE FGBVRS QLKLRACNS VFTALENS BDA I E ITSEDR FIQ 

YANPAFETTMGYQSGELIGKELGBVPINEKKADLLDTINSCIRI 

GKEWQGIYYAKKKNGDNIQQNVKIIPVIGQGGKIRHYVSIIRVC 

NGNNKAEKISBCVQSDTHTDNQTGKHKDRRKGSLDVKAVASRAT 

EVSSQRRHSSMARIHSMTIEAPITKVINIINAAQESSPMPVTEA 

uutx. v iic l ijk i xtiijjfo PQt Q>AKDDDtHANDuVGGlJ4SDGIjRKLSG 

NEYVLSTKNTQMVSSNIITPISLjDDVPPRIARAMENEEYWDFDI 

FELEAATHNRPLIYLGLKMFARFGICEFLHCSESTLRSWLQIIE 

ANYHSSNPYHKSTHSADVLHATAYFLSKERIKETLDPIDEVAAL 

xnn i j. au v une \*n, a w & t ia_DJA(j bBIiAI LYNDTAVLESHHAALAF 

QLTTGDDKCWIFIWMERNDYRTLRC^IIDMVLATEMTKHFEHVN 

KFVNS INKPIjATLEENGBTDKNQEVINTMIiRTPENRTIjI KRMIiI 

KCADVSNPCRP LQYCI EWAAR 1 SEEY FSQTDEE KQQGLP WMP V 

FDRNTCS I PKSQIS F I DYF I TDMFDAWDAFVDLP D I jMQHLDNNF 

KYWKGLDEMKLRNLRPPPB 


5508 


1151 


691 


LSSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPA8GLRGFPN
\^KKVL VDQLVAS PLLGVWY FLGLG CLEGQTVGBS CQELREKFVt 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYLSYL 
KYRSPVPIiTPPGCVAX>DTRAD 


5509


1238 


619 


R KSRGCQNALS ASG P AAAAAA I MVRKL K FH EQKLLKQ VDFLN W E 
VTDHNLHELRVLRRYRLQRREDYTRYNQLSRAVRELARRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAFVEQGHVRVGPDWTDPAFLVTRSM 
EDFVTWVDSSKI KRHVLEYNEERDDFDLEA 


5510 


96 


1195 


PAGAHLS SGS SEPLVEPGRGR VGARVKGERGLQASGS APGRS KM
AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWKRSQAY 
ADYIGF I bTLNEGVKGKKLTFE YRVSEAIEKLVALLNTLDRW ID 
ETPP\^QPSRFGNKAY^TWYAKLDEEAENLVATVVPTHLAAAVP 
E VAVYLKE S VGNS TR I DYGTGH EAAFAAF LCCIjC KI G VLRVDDQ 
IAI VFKVFNR YLEVMRKLQKT YRMEPAGSQGVWGLDDFQFLP FI 
WGSSQLIDHPYLEPRHFVDEKAVNENHKDYMFLECILFITEMKT 
GPFAEHSNQLWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSIiLPIHPVTSG 


5511 


276 


1980



KLSRVbNLPPENLITSISAVPISQKEEVADFQtSVDSLLEKDND 
HSRPDIQ VQAKRLAEKLRCDTWS BI STGQRTVN FKI N RE LLTK 

TVLQQVIEDGSKYGLKSELFSGLPQKKIWEFSSPNVAKKPHVG 
HLRST 1 1 GN FIAN LKE ALGHO V IB TNYT /snwrMnirr i t r-rr r^t 

FGYBEKLQSN PLQHLFEVYVQVNKEAADDKS VAKAAQE FFQRLE 
LGDVOALSLWQKFRDLSIEEYIRVYKRLGVYFDEYSGESFYREK 
SQEVLKhLBSKGLLLKTIKGTAWDlSGNGDPSSICTVHRSDGT 
SLYATRDLAAA I DRMDKYN FDTMI YVTD KGQ KKH FQQ V FQM L Kl 

MGYDWAERCQHVPFGWQGMKTRRGDVTFLEDVLNEIQLRMLQN 
MAS 1 KTT KE LKN PQ BTAER VG LAAb I I QD FKGLLLS D YXFS WDR 
VFQSRGDTGVFLOYTHAHT.K<?T,PPTrrzrr , VT wncMTn^r rtt?rior , 

VSILQHLLRFDEVIiYKSSQDFOPRHIVSYLLTLSHLAAVAHKTL 
QI KDS P PE VAGARItHLFKAVRS VLANGMKLLG ITP VCRM 


5512 


120 


1015


DPSi.LLTITVTGVTVLVLVLKSMNSRRREPITLQDPEAKYPLPL 
IEKEKISHNTRR FRFGL PS PDH VLG LP VGN YVQ LLAKI DNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQY1»ENMK 
IGETIFFRGPRGRLFY1IGPGNLGIRPDQTSEPKKTLADHLGMIA 
v»u i vjx i rnLUljl Khl r KDPSDRTRMSbl FANQTEEDI LVRKELB 
E1ARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAX 
STLILVCGPPPLIQTAAHPNLEKLGYTQDM1 FTY 


5513 


2 


837 


ARWRDPSDSPRIPPAGAETPGRG^CRNYLPSSSPPPPEPSSFPS 
PPTSRGGPGSRDTWSDSEEESQDRQLKIWLGDGASGKTSbTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNLNVTLQ1WDIGGQTIG 
GKMLDKYI YGAQGVLLVYDITK YQS FENDED WYTWKKVS EESE 
j-Vf ij v/uj vu n k.i utfEHM RT I KPKKHLRFCQENG FSSH FVS AKTG 

DSVFLCFQKVAAEILGIlCLNKABIEOSQRWKADXVNYNaEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


VNRPSWIMGNFRGHALPGTFFFIIGLWWCTKSILKYICKKQKRT 
CYLGSKTLF YRLE ILEG IT I VGMALTGMAGEQF I PGCPHLMLYD 
YKQGHWNQLLGWHHFTMYFFFGLLGVADILCFTISSLPVSLTKL 
MLSNALFVEAFI FYNHTHGREMLDI FVHQLLVLWFTjTGLVAFL 
EFLVRNNVLLELLRSSLILLQGSWFFQIGFVLYPPSGGPAWDLM 
DHBNILFLTICFCWHYAVTIVIVGMNYAFITWLVKSRLKRLCSS 
EVGLLKNAEREQESEEEM 


5515 
5Sl£ 


1572 


260 


r vtvuvvan.L»U*_UI/L»ijJ>Vl_lJl inif JJXKUijUSl.ataliKTAVVIDIjGEAF 

TKCGFAGETGPRCIIPSVIKRAGMPKPVRWQYNINrEELYSYb 
KEF I H I LYFPJ4LLVNPRDRR VVT I E S VLCPSH FRE TLTR VL FKY 
FEVPSVLIAPSHLMALLTLGINSAMVLDCGYRESLVLPIYEGIP 
VLNCWGALPLGGKALHKEl^TQIjLEQCrVDTSVAKEQSLPSVMG 
SVPEGVLEDIKARTCFVSDLKRGLKIQAAfCFNIDGNNERPSPPP 
NVDYPLDGEKlLHlLGSIRDSVVKTT.FFnnKnPFnQva'rT tt ncr 

IQCPIDTRKQLAENLWIGGTSMLPGFLHRLLAEIRYLVEKPKY 
KKALGTKTFR1HTP PAKANCVAWLGGAI FGATiOD TT r QD q vq vt? 

YYNQTGRIPDWCSLNNPPLEMMFDVGKTQPPLMKRAFSTEK 




3 


735 


NSREPPQAGPGPSPRKSPTASSFDFPWRPIiASSFWMGACXSAOE's 
I KAMWR V PGTTRRP VTGES PGMHilPEAMLLLLTLALLGGPTWAG 
KMYGPGGGKYFSTTEDYDHEITGLRVSVGLLLVKSVQVKLGDSW 
DVKLGALGGNTQEVTLQPGEYITKVFVAFQAFLRGMVMYTSKDR 
YFYFGKLDGQISSAYPSQEGQVDVGIYGQYQLLGIKSIGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR • 


5517 


246 


499 


SEIYVAMRTDSSKMTDVESGVANFASSARAGRRNALPDIQSSAA ' 
TDGTSDLPLKLE ALS VKEDAKE KDEKTTQDQL EK PQNE EK 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPCLWLGLLLPLVAALDFNYHRQEGMEA " 
FliKT VAQN YS S VTHLHS I GKSVKGRNLW VLWGR FPKEHRIG I P 
E FKYVANMHGDETVGRSLIjLHL ID YLVTSDGXDPE I TNLINSTR 
I HIMPSMNPDGFEAVKXPDCYYS I GRENYNQYDLNRNP PDAFE Y 
NNVSRQPETVAVMKWLKTETFVLSANLKGGALWIS YPFDNGVQA 
TGALYS RS LT PDDD VFQYLAHT YAS RNPNM KKGDEC KNKMNFPN 
3VTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
S^^^^WKASLIF^IKQVHU3VKGQVF13QNGNPLP^AyIVEVQDRK | 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








HICPYRTNKYGEYYliLLDPGSyilKVTVPGHDPHITKVI I PEKS " 
QN FS ALKKD I LLP FQGQLDS I P VSNPSC PM I PLYRNLP DHS AAT 
KPSLFLFLVSLLHIFFK 


5519 


87 


477 


I KS KLNQQVEVQESEWRLTEAKGPTMGKBSGWDSGRAAVAAWG 
G W AVGTVL VALS AMG FT3 VG IAASS IAAKMMS TAAI ANGGGVA 
AGSLVA I LOS VGAAGLS VTS KVI GG FAGTALGAWLGS P PSS 


5520 
5521 


117 


943 


PTEGRQKVLKTFTVPRSALAMTKTSTCIYHFLVLSWyTFliNYYI 
SQEGKDEVKPXILANGARWKYMTLLNLLLQTIFYGVTCLDDVLK 
RTKGGKDIKFLTAFRDLLFTTLAFPVSTFVFLAFWILFLYNRDL 
IYPKVLDTVIPVWLNHAMHTFIFPITLAEWLRPHSYPSKICrcL 
TLLAAAS I AY I SR I LWLYF BTGTWVY PVFAKLSLLGLAAFFSLS 

YVFIASIYLLGEKLNHWKWVSVQILQRWRLESVGICFQWPDWKS 
PAKHQLVKNIR 




546 


911 


KILNMQKSCEENEGKPQNMPKAEEDRPLEDVPQEAEGNPCjPSEE 
GVSQEAEGNPRGGPNOPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREE IRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


5522 


1224 


63 7 


GSR PLGQRS RE KM W VFG YGS L I WKVD FP YQD K L VGY I TN YS RRF 
WQG STDHRG V PGKPGR WTLVE D P AG CVWG V A Y RLPVG K EE EVK 
AYLDFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDY1X3 
PAPLEDIAEQI FNAAG PSGRNTEYLF ELANS I RNLVPEKADEHL 
FALE KLVKERLEGKQNLNC I 


5523 


3 


1280 


k9n.w<viNj\i waaiiaHHiAKKr vtUUMiUvNr UHr QILiRAIGKGSFG 

KVCI VQKRDTE XMYAMKYMNKQQC I ERDEVRNVFRELEI LQ El E 
HVFLVNLWYS FQDEEDMFMWDLliLGGDLRYHLQQNVQFS EDTV 
RLYICEMALALDYLRGQHIIHRDVKPDNILLDERGHAHLTDFNI 
ATI I KDGERATALSGTKPYMAPEI FHS FVNGGTGYSFE VDWWSV 
GVMAYELLRGWRPYDIHSSNAVESLVQLFSTVSVQYVPTWSICEM 
VALLRKLLTVNPEHRLSSLQDVQAAPALAGVIiWDHLSEKRVEPG 
FVPNKGRLHCDPTFELEEMILESRPLHKKKKRLAKNKSRDNSRD 
SSQSENDYLQDCLDAIQQDFVIFNREKLKRSQDLPREPLPAPES 
RDAAE P VEDE AERS ALPMCG PI CPSAGSG 


5524 


85 


2318 


RERERDHR PG ES S QGQSGAGG C F P S P TMELRCGGLLFS S RFDS G 
NLAH VERVES LSSDGEGVGGGASALTSGIASSPDYEFNVWTRPD 
CAETEFEJNGNRSWFYFSVRGGMPGKLIKINIMNMNKQSKLYSQG 
MAP FVRTL PTR PR WER I RDR PTFEMTETQFVLS FVHR FVEGRGA 
TTFFAFCYPFSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
ELLCYSLDGLRVDLLTXTSCHGLREDREPRLEQLFPDTSTPRPF 
RFAGKR I FFLSSRVHPGETP S S FVFNGFLDFILRPDDPRAQTLR 
RLPVFKLIPMLNPDGWRGHYRTD^RnVNI.MPnVi VDdavt una 

IYGAKAVLLYHHVHSRLNSQSSSEHQPSSCLPPDAPVSDLEKAN 
NLQNEAQCX3HSADRHNAEAWKQTEPAEQKLNSVWIMPQQSAGLE 
ESAPDTI PPKESGVAYYVDLHGHASKRGCFMYGNS FSDESTQVE 
NMLYPKLI SLNS AHFDFQGCNFSEKNMYARDRRDGQS KEGSGRV 
AI YKASGI IHSVTLECNYNTGRS VNS I PAACHDNGRAS PPPPPA 
FPSR YT VELPEQVGRAMAI AALDMAE CNPW PR I VLS EHS S LTNL 
RAWMLKHVRNSRGLSSTLNVGVNKKRGLRTPPKSHWGLPVSCSE 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGL 
PGIX3SSTX3KVTHRVLGPVRGKPVWEPLQHVFGCLGHCWGK 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEEFLGRVAELNDVTAKVASGQEKHLLFEVQPGSDSSAFWKV 
VVRVVCTKINKSSGIVEASRIMNLYQFIQLYKDITSQAAGVLAQ 
SSTSEEPDENSSSVTSCQASLWMGRVKQLTDEEECCICMDGRAD 
LILPCAHSFCQKCIDKWSDRHRNCPICRLQMTGANESWWSDAP 
TEDDMAN YI LNMADE AGQPHRP 


5526 


3 


853 


RRPCN P VRAAKRTGAAARA PRGLE VTMLR VAWRTLS L I RTRA VT 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDP PPSTLLKD YQNVPGI E KVDDWKRLLSLEMANKKEMLKI 
KQEQFMKKIVANPEDTRSLEARIIALSVKIRSYEBHLEKHRKDK 
AHKRYLLMSIDQRKKMLKNLRNTNYDVFEKICWGLGIEYTFPPL 
YYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAX 
RRNPDS PAKAI PKTLKDSQ 


5527 


3225 


565 


IiLRKyLLHQNPLLliRHQPNRTCISF^ATMKLKDTKSRPKQSSCG ' 
KFQTKGIKWGKWKEVKIDPWJPADGQMDDLVCFEELTDVQLVS 
PAKNPSSLFSKEAPKRKAQAVSEEEEEBEGKSSSPKKKIKLKKS 
KNVATEGTSTQKEFBVKDPELEAQGDDMVCDDPEAGEMTSENLV 
QTAPKKKXNKGKKGLE PSQSTAAKVPKKAKTW I PEVHDQKADVS 
AWKDLFVPRPVLRALSPLGFSAPTPIQALTLAPAIRDKLDILGA 
AETGSGKTLAFAIPMIHAVLQWQKRNAAPPPSKTEAPPGETRTE 
AGAKTOS PGKAEAES DALPDDTV I ESEALPSDI AAEARAKTGGT 
VS DQALLFGDDDAGEGPSSLI RE KPVPKQNEWEEENLDKEQTGN 
h KQ ELDD K£ ATCKA YP KR PLLGL VLTPTRE LAVQVKQH I DAVAR 
FTGI KTAILVGGMSTQKQQRMLNRRPEI WATPGRLWELI KEKH 
YHLRNLRQLRCLWDEADRMVEKGH FAELSQI^iEMLNDSQYNP K 
RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RGKP KVI DLTRNEATVETLTETK I H CETDE KDF YLY Y FW3Q YPG 
RSLVFANSISCIKRLSGLLKVLDIMPLTLHACMHQKQRLRKLBQ 
FARLEDC VLLATDVAARGLDI PKVQHVIHYQVPRTS E I YVHRSG 
RTARATNEGLSLMLIGPEDVINFKKI YKTLKKDEDI PLFPVQTK 
YMD WKERI RLARQI EKSE YRNFQACLHNS W1EQAAAALEI EliE 
EDMYKGGKADQQEERJRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTQSGKPPLLVSAPSKSESALSCLSKQKKKKTKKPKEPQPEQP 
QPSTSAN 


5528 


3


895 


GPFLSACRMWGACKVKVHDSLATISITLRRYLRIjGATMAKSKFB" 
Y VR D FEADDTCLAHCW VWRLDGRNFHR FAE KHNFAKPNDS RAL 
QLMTKCAQTVMEELED1VIAYGQSDEYSFVFKRKTNWFKRRASK 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRVWYPSNQT 
LKD YIiS WROADCH INNL YNT VFWAL I QQSGLTP VQ AQGRLQGTL 
AADKNEILFS EFN1N YNNE PPM YRKGTVLI WQFCVDEVMTKE IKL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529


46 


640 


TFRLVSAHLKTRKLINPEAAERRWRC WDSRQG WLS VKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEWSESCVEVQEINEEEYLVIIRFTPTVPHCSLATL 
IGLCLRVKLQRCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 
AAAMEWPNLREIVEQCVLEPD 


5530 


4541 


2606 


AQ I VHAI S YCHKliHVGH RDLKP ENWF FEKQGLVKLTD FG FSNK 
FQPGKKLTTSCGSIAYSAPEILLGDEYDAPAVDINSLGVILFML 
VGGQPPFQEANDSETLTMIMDCKYTVPSHVSKECKDLITRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYECNLSEEEHN 
S 1 1 QRMVLGD IADRDAI VEALETNRYNHITAT YFLLAER I LREK 
QEKEIQTRSASPSNIKAQFRQSWPTKIDVPQDLEDDLTATPLSH 
ATVPQSPARAADSVLNGHRSKGLCDSAKKDDLPEIAGPALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRJCSAPVLNQIFEEGESDDEFDMDE^PPfCLSRLKMNI 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSNSMQLASRSAGELVESLK 
LMSLCLGSQLHGSTKYIIDPQNGLSFSSVKVQEKSTWKMCISST 
GNAGQVPAVGGI KFFSDHMADTTTELERI KSKNLKNNVLQLPIiC 
EKTISVNIQRNPKEGIiLCASSPASCCHVI 


5531 


24 


515 


GSQPRAPRPRDSMERPEPELIROSWRAVSRSPLEHGTVLFARLF 
ALE PDL L P LFQ YNCRQ FS S P EHCI iSS PE FLDH I RKVMLV I DAAV 
TNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGWDGE 


5532


3395 


1402 


SDWKWGKRKMIIEDETEFCGEELLHSVU3CKSVFDVLDGEEMR 
RARTRAN P YEK I RG VFFLNRAAMKMANMD F VF DRM FTNPRD S YG 
KPLVKDREAELLYFADVCAGPGGFSEYVLWRKKWHAKGFGMTLK 
GPNDFKLEDFYSASSELFBPYYGEGGIDGDGDITRPENISAFRN 
FVLDNTDRKGVHFLMADGGFSVEGQENLQEILSKQLLLCQFLMA 
LS I VRTGGHFICKT FD LFT P FS VGLVYLL YCCFER VCLFKP ITS 
RPANS ER YWC KGL KVG I DD VRD YLFAVN I KLNQ LRNTDS DVNL 
WPLEVIKGDHEFTDYMIRSNESHCSLQIKALAKIHAFVQDTTJT 
SEPRQAEIRKECLRLWGIPDQARVAPSSSDPKSKPPELIQGTE1 
DIFSYKPTLLTSKTLEKIRPVFDYRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDRWI KLDLXTELPRDTLLSVE I VHELKGEGKAQRKI 
SAI H I LD VLVLNGTDVREQH FNQR I QLAE KFVKAVS KPS R PDMN 
PIRVKEVYRLEEMEKIFVRLEMXTTtf/jcenTDVT cvTrDnnnm? 

VPMGLYIVRTVNEPWTMGFSKSPKKKFFYNKKTK0STFDLPADS 
IAP FHICYYGRLFWEWGDG IRVHDSQKPQDQDKLSKEDVLS FIQ 
MHRA 


5533


94 


789 


MKERRAPQPWARCKLVLVGDVQCGKTAMLQVLAKDCYPETYVP 
rVFENYTACLETEEORVEt*SLWDT<;f;^PwnwvDDr r>vcncnft»r 

LLCFDISRPETVDSALKKWRTE1LDYCPSTRVLLIGCKTDLRTD 
LSTLMELSHQKQAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI 
HS 1 FRTASM LCLN KPS P LPQKS P VR S LS KRLLHL PS RS EL I S PT 
FKKEKAKXCSIM 


5534 


3 


605 


LVRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
*vj Yiii\iv i urtrtvjrt v i Liijo Jj i Juijc Vj I \j/\i3 r VYPAYAS IK 

AIESPSKDDDTVWLITWVVYALFGLAEFFSDLLLSWFPFYYVGK 
CAFLLFCMAPR PWNGALMLYQRWRPLFLRHHGAVDRI MNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KS FMDSEARLCS LVELS OTQDETQKS DS ENEDLKI DCLQES QEL 

NLOKLKNRRPTT.TPJk.KTkTfMTJCT.TtTWT Y\AVT?rkT Tirri TI/mOtinn rr 

*»*jv*>>wwoci*\«i, ua c»rtivv*vnKc«jLH vr*x lu^t\JiUljlKr»uIKTGNDAK 
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQELENKDLSDVAM 
KVKI*QKEFRKK\n5AAJCLiRVQVIjQKKQQDSKiaiASLSIQNEKRAN 
ELEQSVDHMKYQKIQLQRKLQEENEKRKQLDAVIKRDQQKIKVI 
LSYI PAKYNMKC 


5536 


942 


282 


AAATAAS LS PRGCRLRTPS SDVSPSRA^? PPSAAPLPTGRAQMS P 
SGRLCLLTI VGL I LPTRGQTLKDTTS SS S ADAT I MD I QVPTRAP 
DAVYTELQPTSPTPTWFADETPQPQTQTQQLEGTDGPLVTDPET 
HKST KAAH PTDDTTTLSER P S PSTDVQTDPQTLKPS GFHE DDP F 
FYDEHTLRKRGLLVAAVLFITGIIILTSGKCRQLSRLCRNHCR 


5537 


3 



2391 


RARVSSPQLRVFRSGRPRRLRVLRINRTSVALRI.AGTGRFVAKl' 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGLWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 
YNKDLLTEHCTBASFQKVISRRHGSCDLENLHLRKRWKREECEG 
HNGCYDEKTFKYDQFDESSVESLFHQQILSSCAKSYNFDQYRKV 
FTHSSLLNQQEEIDIWGKHHIYDKTSVLFRQVSTLNSYRNVFIG 
E KNYHCNNS EKTLNQSSSP KNHQENYF LE KOYKCKE FEE VFLQS 
MHGQEKQEQSYKCNKCVEVCTQSLKHIQHOTIHIRENSYSYNKY 
DKDLSQSSNLRKQIIHNEEKPYKCEKCGDSLNHSLHLTQHQI1P 
TEEKPYKWKECGKVFNLNCSLYLTKOOOTDTGPNLYK'rK'ar'CR'C 

FTRSSNLIVHQRIHTGEKPYKCIOSCGKAFRCSSYLTKHKRIHTG
EKPYKCKECGKAFNRSSCLTQHQTTHTGEKLYKCKVCSKSYARS
SNLIMHQRVHTGEKPYKCKECGKVFSRSSCLTQHRKIHTGBNLY
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE
CGKAFSYSSDVIQHRRIHTGQRPYKCEECGKAFNYRSYLTTHQR
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA
FSYRSYLTTHRRSHSGERPYXCEECGKAFNSRSYLIAHQRSHTR

EKL 


5538


926 


161 


HSMMMKIPWGSIPVLMLLLLLGLIDXSQAQLSCTGPPAIPGIPG
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGNP
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI
NVPLRRIXJTIRFDHVITNMKNNYEPRSGKFTCKVPGLYYFTYHA
SSRGNLCVNLMRGRERAQKWTFCDYAYNTFQVTTGGMVLKLEQ
GENVFLOATDKNSLLGMEGANSRFSGFLLFPDMEA


5539 


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG
IVDGPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKRBK
DEIYGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCS
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL
ELEKVHDLCDNFCHR Y I TCIiKGKMPIDIiVI BDRDGGCREDFEDY 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDO^DGLDTSVASPSSGGEDEDLDQERRRNKKRGIFPKVATNIM 
RAWLFQHLSEPYPSEEQKKQLAODTGLTILQVNNWFINARRRIV 

QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWHYL 


5540


148 


1440 


PPLGAGAGVHARSPHPARRT.Pr.TTArvrypaDm t oTm-inftu^ — 

PSGAAAPGCALPRGQALBGPRSCRRPOPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKRBKDEI 
YGHPLFP LLALVFEKCELATCS PRDGAGAGLGTPPGGDVCSSDS 
FREDNTAFAKQVRSSRPLFSSNPELDNLMIQAtQVLRFHLLELE 
XGKWP I DL VIEDRDGGCR EDFEDYPAS CPSL POQNNI WI RDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTG LT I LQVNNW F I NARRR I VQ PM I DQSNRTGQGAAFS PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


143 


1440 


rrLA3/\iarto v tiKKa fttf>u<Kijk' JLi'lT AG VGGRAPDLLPT PWRQHRG 
PSGAAAPGCAIiPRGQALEGPRSCRRPQPMARRYDELPr-lYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGH PL FP LLAL VFE KCE LATC S PRDGAGAG LGT PP GGD VCS SDS 
FNEDNTAF AKQVRS E R PL FSSN PE LDNLM I QAI QVLR FHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED 
SGS VHLGTPGPSSGGLASQSGDNS SDQGVGLDTS VA5 PSSGGED 
EDLDQEPRRNKKRGI FPKVATN IMRAWLFQHLSH P YPS EEQKKQ 
IAQDTGLTILOVNNWFINAKRRIVQPM2DQSNRTGQGAAFSPEG 
QPIGGYTET3PHVAFRAPASVGDEFGTRKEEWHYL 


5542 


148 


1440 


f v LjWVj Wj, VrtARb PHFARRL PLTTAGVGGRAPDLL PT P WRQH RG 
PSGAAAPGCALPRGQALEGPRS CRRPQPMARRYDEL PH YPG I VD 
G PAALAS FP ETVPAVPG P YG PHR P PQPLP PGLDS DGLKRE KDB I 
YGH P LF PLLAL VFEKCELATCS PRDGAGAGLGTP PGGDVCS SDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHBD 
SGS VHLGTPG PSSGGLASQSGDNS S DCGVGLDTSVAS PSSGGED 
EDLDQEPRRNKKRGI FPKVATNIMRAWLFQHLSHPYPS EEQKKQ 
LAQDTGLTILQVNNWFINARRR1VQPMIDQSNRTGQGAAF3PEG 
QPIGGYTETEPH VAFRAPAS VGDEFGTRKEEWH YL 


5543 
5544


2405


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAI? 
KRPFSDSGAFWS PERRPGVLEAPRRR PVPAS FRAVPPKPTRVHG 
S S AS RDRVLARTMI VADS E CRAEL KD YLR PAPGG VGDSGPGEEQ 
RBSRARRGPRGPSAFIPVEEVLREGAESLEQHLGLEALMSSGRV 
viwunruir a or ww.lJtllljlj.LiHi UwPLASSWHHYIAZMAA 
ARHQCS YLVG S HMAE FLQTGGDP E WLLGLH RAP E KLRKLS E I NK 
LLAHRPWLITKEHIQALLKTGEHTWSLAELIQALVLLTHCHSIjS 
SFVFGCGI LPEGDADGS PAPQAPTPPS EQSS PPS RDPLNNSGGF 
ESARDVEALMERMQQLQESIiLRDEGTSQEEMESRFELEKSESLL 
VTPSADILBPSPHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
D YTWEDHG YSLI QRLY PEGGQLLDE KFQAA YS LT YNTI AM HSGV 
DTSVLRRAIWNYIHCVFGIRYDDYDYGEVNQLLERNLKVY I KTV 
ACYPEKTTRRMYNLFWRHFRHSEKVirVNLLLLEARMQAALLYAL 
RAITRYMT 




1895 


514 


LGGLLG RQRLLLRMGAGRLGAPMERHGRASATS VSSAGEQAAGD ' 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVES LRKKR PL FPWFGLD IGGTLVKLVY FEPKDI TAEEEEEEV 
ESLKS I RKYLTSNVAYGSTG I RDVHLELKDLTLCGRKGNLHFIR 
FPTHDMPAFIQMGRDKNPSSLHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELDCLIKGILYIDSVGFNGRSQCYYFENPADSEKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRD1YGGDYBRFG 
LPGWAVAS S FGNNMS KEKREAVS KEDLAR ATLITITNNIGS IAR 
viCALNEW I NQWFVGN.FLRINTXAMRLLAYALD YWSKGQLKALF 
5EHEG Y FGAVGALLELLK I P 


5545


802 


131 


GAMWSAGRGGAAWPVLUJLLLALLVPGGGAAKTGAELVTCGSVL 
KLLNTHHRVRLHS! IDI KYGSGSGQQS VTGVEASDDANSY WRIRG 
GSEGGCPRGS PVRCGQAVRLTHVLTGKNLHTHHF PS PLSNNQEV 
SAFGEDGEGDDLDLWTVRCSGCHWEREAAVRFQHVGTSVFLSVT 
GEQYGS P IRGQHEVHGM PS ANTHNT W KAMEG I F I KPS VE PSAGH 
DEL 


5546 


1592 


146 


FVP RGGHSS MGQ SGRS RHQ KRARAQAQLRN LEA YAANPHS FVFT 
RGCTGRNIRQLSLDVR R VM EPLTAS RIiQ VR K KNS LKDCVAVAG P 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEO^FAHPPLLVIaNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNT I KRCLL ID YNPDSQELDFRHYS I KWPVGASRGMK 
KLLQEKFPNMS RLQDISELLATGAGLSESEAEPDGDHN ITELPQ 
AVAGRGNMRAQQS AV RLTE I GPRMTLQL I KVQ EG VGEGKVMFHS 
FVS KTEEELQAI LEAKEKKLRLKAQRQAQQAQNVQRKQEQRE AH 
RKKSLEGMKKARVGGSDEEASG2PSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQ KF PXTKD KSQG AQARRG P RGAS RDGGRGRGRGR PG KR VA 


5547 


1592 


146 


FV P RGG HSS MGQSG RS RHQ K RARRQAQ LRNL E AY AAN PH S FVFT 
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTH FLILS KTETNVY PKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQFAHP PLLVLNSFGPHGMHVKLMATMFQNLFPS I 
NVH KVNLNTI KRCLL1DYNPD SQELDFRHYS I KWPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQS A VRLTE I G PRMTLQLJKVQEGVGEGKVMFHS 
FVS KT EE ELQAI LEAKE K KLRLKAQRQAQQAQNVQRKQ EQRE AH 
RKKSLEGMKKARVGGSDEEASGIPSRTA55LELGEDnnEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5548


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLARALRGNETTA 
DSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRF 
RKQRICAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEEIR1RSADDCKQFREEFNSLPSGHIQGTFELANKEEN 
REKNRYPNILPI7DHSRVILSQLDGIPCSDYINASYIDGYKEKNK 
FI AAQGPKQETVNDFWRMVWEQKSAT I VMLTNLKERKEE KCHQY 
WPDQG CWTYGN I R VCVEDCWLVDYT I R KFC I QPQLPDGCKAPR 
LVSQLHFTSWPDFGVPFTPIGMLKFLKKVKTLNPVHAGPIWHC 
SAGVGRTGTFIVIDAMMAMMIJAEQKVDVFEFVSRIRNQRPQMVQ 
TDMQ YTF I YQALLE YYLYGDTELDVS S LEKHLQTMHGTTTHFDK 
IGLEEEFRKLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNR 
VI LSMKRGQEYTDY INAS FI DGYRQKDYFIATQGPLAHTVEDFW 
RMIWEWKSHTIVMLTEVQEREQDKCYQYWPTEGSVTHGEITIEI 
KNDTLS EAIS IRDFLVTIiNQPQARQEEQVRWRQFHFHGWPE IG 
I PAEG KGM IDLI AAVQKQQQQTGNHP ITVHCSAGAGRTGTF IAL 
SNILERVKAEGLLDVFQAVKSLRLQRPJIMVQTLEQYEFCYKWQ 
DFIDI FSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTAIiHDREMAIQAKKKLTTATDPIERLRLQ 
CLARGS AG I KGLGRVFR IMDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTIDFNEFLLTLRPPMSRARKEVIMQAF 
RKLDKTGDGVITI EDLREVYNAKHHPKYQNGEWS EEQVFRKFLD 
NFDS P YD KDGLVT PEE FMN YYAGVSAS IDTDVY F 1 1 MMRTAWKL 


5550 


2364 


1210 


R KR KV FL KMRRLN RKKTLS LVKE LDAFP KVPES YVETS AS GGT V 
SLIAFTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI 
TVAWKCQYVGADVLDLAETMVASADGLVYEPTVFnr^PQQKEWQ 
RMLQLIQSRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACRIHGHLYVNKVAGNFTIITVGKAIPHPRGHAHLAALVNHESYN 
FSHRI DHLS FGBLVPAI IN PLDGTEK I AIDHNQM FQ Y FI TWPT 
KLHTYK IS ADTHQ FS VTERER I I NHAAGSHGVSG I FMKYDLSSL 
MVTVTEEHMPFWQFFVRLCGIVGGIFSTTGMLHGIGKFIVEIIC 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLENNTH 


5551 


211 


1700 


MQREHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE 


5552 






W FVF RR YAE FD KLY NTL K KQ F PAMALK I PAKRl FGDNFOD P~DFIK f 

ORRAGLNEFIQNL^/RYPELYNHPDVRAFLQMDSPKHQSDPSEDB 

DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 

liakrkldgkfyavkvlqkkivlnrkeq:<himaernvllknvkh 

PFLVGLHYSFO/TTEKLYFVLDFVNGGELFFKLQRERSFPEHRAR 
1 FYAAElASALQYLHSIKIVYRDLKPENILLDSVGHWLTDFGIiC 
K EG I A I S DTTTT FCGT P EY LAP E VI R KQP YDNT VDWWCLG AVL Y 
EML YGLP P F YCRDVAEM YDN I LH KPLSLR PGVS LTAWS I LBELL 
EKDRQNRLGAKEDFLEIQNHPFFESLSWADLVQKKIPPPFNPNV 

AGPDDIRN7DTAFTBETVPYSVCVSSDYSIVNASVLEADDAFVG 
FSYAPPSEDLFL


5553


2748 


930 


IXjPAAGAAMGKXHKKHKAEWRSSYEDYADKPIiEKPLKLVLKVGG" 1 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
PVRACRTQPAENESTPIQQLLEHFLRQLQRKDPHGFFAFPVTDA 
IAPGYSMIIFCHPMDFGTMKDKIVANEYKSVTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKILHAGFKMMSKQAALLGNBDTAVEEPVP 
EWPVQVETAKKSKKPSREVISCMFEPEGNACSLTDSTAEEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLLYSWNTAEP 
DADEEETHPVDLSSLSSKLLPGFTTLGFKDERRNKVTFLSSATT 
AliSMQbJNSVFGDLKSDEMELLYSAYGDBTGVQCAXiSLQEFVKDA 
GSYSKKWDDLLDQITGGDHSRTLFQLKQRRWVPMKPPDEAKVG 
DTLGDSSSSVLEFMSMKSYPDVSVDISMLSSLGKVKKELDPDDS 
| HLNLD ETT KLLQ DLHEAQAERGGS RPSSNLSS LSNASERDQHHI/ 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT 


5554


74 


1095 


IiGREAVYLVS RMDG PVAEHAKQEPFHVVTPLLESWAliS 0VAGMP 1 
VFLKCENVQPSGSFKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAAYAARKLG I PAT 1 VL PES TS LQWQRhQG EGAE VQLTG JCVWD 1 

EANLRAQELAKRDGWENVPPFDHPLIWKGHASLVQELKAVLRTP 
PGAL VLAVGGGG t»LAG WAGLLE VGWQH VP 1 I AMETHGAH CFNA 
A I TAG KL VTL PD I TS VAKS LG AKTVAARAL ECMQVCK1 HS E WE 
DTEAVSAVQQLLDDERMLVEPACGAAIAAIYSGLLRRLQAEGCIi 
PPSLTSVWIVCGGNNINSRELQALKTHLGQV




166 


2318 


CS GRTGGRG S LR PAENV CLTCK L S GAE TRGLLC PALRTWIMK VH J 
GRS FFWVLF PVLPWAVQAVEHEE VAQR VI KLHRGRGVAAMQS RQ 
WVRDSCRKLSGLLRQKNAVLNKLKTAIGAVEKDVGLSDEEKLFQ 
VHTFE I FQKF.LNESENS VFQAVYGLQRALQGDYKD WNMKESSR 
QRLEALREAAI KE ETEYMELLAABKHQVEALKNMQHQNQSLSML 
DEILEDVRKAADRLEEBIEEHAFDDNKSVKGVNFEAVLRVEEEE 
ANSKQNITKREVEDDLGLSMLIDSQNNQYILTKPRDSTIPRADH 
HFIKDIVTIGMLSLPCGWLCTAIGLPTMFGYI ICGVLLGPSGLN 
SIKSIVQVETLGEFGVFFTLFLVGLEFSPEKLRKVWKISLQGPC 
YMTLLMI AFGLLWGHLLRI KPTQSVFI STCLS LSS TPLVSRFLM 
GS ARGDXEGD I D YS TVLLGMLVTQDVQLGLFKAVM PTL IQAGAS 
ASSS I WEVLR I LVLIGQI LFSLAAVFLLCLVI KKYLIG PYYRK 
LHMES KGNKEILI LGISAFI FLMLTVTE LLDVS ME LG CFLAGAL 
VSSQG P WTEE I AT S I E P I RDFLAI VFFAS IGLHV F PTFVA YE L 
TVLVFLTLS VWMKFLLAALVLS LI LP RS SQY I KW I VS AGLAQ V 

S3FSFVLGSRARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI 
TRCVPRPERRSSL 


5555
5556


212 
£835 


1425
3346


L y LRTRE T PAP P RCEAASQGR VG WRAD AAAEEAVRS V WNRTR DR I 
GTMAPQWLSTFCLLLLYLIGAVIAGRpFYKILGVPRSASIKDlK 

Kax kkJjAXjQLHFDRN pddpqaqe kfqdlgaayevls dsekrkqy 
DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMFGGTPROQDRNIPR 

gsdiivdlevtleevyagnfvewrnkpvarqapgkrkcncrqe 
mrtrqlgpgrfqmtqevgcdecpnvklvneertleveiepgvrd 

3MEYPFIGEGEPHVDGEPGDLRFRIKWKHPIFERRGDDLYTNV 
riSLVESLVGFEMDITHLDGHfCVHISRDKITRPGAiakKKGEGL 
J^^IKGSLIITFDVDFPKEQLTEEAREGIKQLLKQGSVQK 

3TRGMSKNCVPME FEE YbLRM FQGTFYLLQKI TKDNNAHTVKS R 
LEEIiDESYIEKFTDFI*RLFVSVHLRRIESySQFP\n/EFLTLLFK 
YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLNRYE 
DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYLR 
QSLEWAKVMELLPTHAFSTLFPVLQDNLEVYLGLQQFIVTSGS 
GHR LN ITAEND CRRLH CS LRDLS S LLQA VGRIiAEYFIGDVFAAR 
FNDALTWBRLVICVTLYGSQIKLYNIETAVPSVLKPDLIDVHAQ 
SIJUUXJAYSHWLAQYCSEVHRQNTQQFVTLISTTMDAITPLIST 
KVQDKLLLSACHLL VS LATTVR PVFLI S I PA VQKVFNR I TDASA 

LRIiVTJTf JVOVTA/PUnT CKfTT.T T OUDVIT OE , Ml?/'V\Mr»i?oe»T*Titi»f-.» ▼ 
mxvuv wMiy v u v uh/vuoim i louitV WFNljPr^lis^WPVKSlNHASLI 

SALSRDYRNLKPSAVAPQRKMPLDDTKLIIHQTLSVLEDIVENI 

SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDEMIiSFFL 

TLFRGliRVQMGVPFTEQIIQTFLNMFTREQLAESILHEGSTGCR 

WEKFLKILQVWQEPGQVFKPFLPSIIALCMEQVYPIIAERPS 

PDVKAELFELLFRTLHHNWRYFFKSTVZiASVQRGIAEEQMENEP 

QFSAIMQAFGQSFLQPDIHLFKQNLFYLETLNTKQKLYHKKIFR 

TAMLFQFVNVLLQVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 

AFLP2FLTS CDGVDANQKS VLGRNFKMDRVRRERGRAKRRAEWA 

R K PGTCAARRG H I EAS GRGL CP PCSLAAAHEM P ADLVL 


5557 


1712 


491 


v j, uvjMiiuniJivijnvvi r v vtjijfc'KKljKIjS/VLJUaAGRFCILGSEAATR 
KHLPARNH CGL5DS S PQLWPEPDFRN PPR KASKASLDFKR YVTD 
RRLAETLAQIYLGKPSRPPHLLLECNPGPGILTQALLEAGAKW 
ALESDKTFIPHLESLGKNLDGKLRVIHCDFFKLDPRSGGVIKPP 
AMS S RGL F KN LG IE A V PWTADI PLXWGM FPSRGEKRAL WXLA Y 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSV1 
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRELLDQLQQKLY 
LIQMI PRQNLFTKNLTPMNYNI FFHLLKHCFGRRSATVIDHLRS 
LTPLDARDILMQIGKQED3KWNMHPQDFKTLFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


RAGC7-HPQVPADLGAPAEPRRPQKTCVCLLQPQPGG0RGPTTMI 
a Kf v r o nivLm i *? vw»vijJ.ol^i v_linyiu£vAuAELQEADGQCPVDRS 
HiKLKMVQWFRHGARS PLKPLPLEEQVEWN PQLLEVPPQTQFD 
YTVTNLAGGPKPYS PYDSQYHETTLKGGMFAGQLTKVGMQQM FA 
LGERLRKNYVEDIPFLSPTFNPQEVFIRSTNIFRNLESTRCLLA 
GLFQCQKEGP 1 1 1HTDEAD3EVLYPNYQSCWSLRQRTRGRRQTA 
SLQPG ISEDLKKVKDRMGIDSSDKVDFFILLDNVAAEQAHNLP5 
CPriLKRFARMIEQRAVDTSLYILPKEDRESLQMAVGPPLHILES 
NLLKAMDSATAPDKIRKLYLYAAHDVTFIPLLMTLGIFDHKWPP 
FAVDIiTMELYQHLESKEMFVQLYYHGKEQVPRGCPDGLCPIiDMF 
LNAMS VYTLS PE KYHALCS QTQVMBVGNEE 


5559 


150


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSIiLETLSPEEMEELEK ' 
ELDWDPDGSVPVGLRQRNQTEKQSTGVYNREAMLNFCEKETKK 
LMQREMSMDESKQVETKTDAKNGEERGRDASKKALGPRRDSDLG 
KEPKRGGLKKSFSRDRDEAGGKSGEKPKEEKIIRGIDKGRVRAA 
VDK KE AGKDG RG E ERA VATKKE ER K KG S DJ3 MTfi T . <! 3 nvn v v d p c 

MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 
VKRGTGNTDTKKDDEKVKKNEPLHEKEAKDDSKTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPSXFDEPLERVKNNDPEMTSVNVNNSDC 
ITNEILVRFTEALEFNTVVKIjFALANTRADDHVAFAIAIMLKAN 
KT I TSLNLDSNH I TG KG I LAI FRALLQNNTLTE LRFHNQRH I CG 
GKTEMEIAKLLKENTTLLKLGYHFELAGPRMTVTNLLSRNMDKQ 
RQKRLGEQRQAQEAKGEKItDLLE VP KAGAVAKGS P VP QDDDCOv 

PSPKNS PKKGGAPAAPPPPPPPLAP PL I MENLKNSLSPATQRKM 
GDKVL P AQEKNSR DQLLAA I R SSNL KQL KKVE VP KLLQ 


5560 


9 


921 


SSWEFSALSVSMACLSPSQLQKFQQDGFLVLEGFLSAEECVAM 
QQRIGEIVAEMDVPLHCRTEFSTQEEEQLRAQGSTDYFLSSGDK 
IRFFFEKGVFDEKGNFLVPPEKSINKIGHALHAHDPVFKSITHS 
FKVQTLARSIiGLQMPWVQSMYIFKQPHFGGEVSPHQDASFLYT 
EPLGRVLGVWIAVEDATLENGCLWFIPGSHTSGVSRRMVRAPVG 
S APGTS FLGS E P ARDNS L F VPTP VQ RG ALVLIHGEWHKS KQNL 
SDRSRQAYTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


5561 


2175 


1775


CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 
QLLAPTYPSAPGVKNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
w IUOVi * J ^^^uwuvy^iv^&FFRKTPOPVT-.KPPPPEVVSRGS 

s 


5562


342 


1385 


SSGKNOMAAAGAAGLVRGLKAGVLSQADYLNLVQCETIiEDLKLH 
L<3S7DYGNFLiANEASPL'I*VSVIDDRLKEKMWEFRHMRNHAYEP 
LASPLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 
QMEAVNIAOTPABLYNAILVDTPLAAFFODCISEQDLDEMNIEI 
I RNTL Y KA YLES FY KFCTLLGGTTADAMCP I LE FEADRRAF I IT 
INS FGTELS KEDRAKLPPHCGRLYPEGLAQLARADDYEQVKNVA 
ux x PbiKIjliFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRN1VWIAECIAQRHRAKIDNYIPIF 


5563 


342 


1385 


ssgkndMaaagaaglvrgi^vlsqadyi^lvqcetlbdlklh 

I^STDYGNFLAMEASPLTVSVIDDRLKEKKVVEFRHMRNHAYEP 
LASFLDFITYSYMIDNVILLITGTLHQRSIABLVPKOIPLGSFE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 
IRNTLYXAYLESFYKFCTLLGGTTADAMCPILEFEADRRAFIIT 
INS FG TELS KEDRAKLFPHCGRIi YPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRN1VWIAECIAQRHRAKIDNYIPIF 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGALGLALIiL 
LLGLG LGLEAAAS PLSTPTSAQAAGPSSGSCPPTKFQCRTSGIjC 
VPLTWRCDRDLDCSDGSDEEECRIBPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKliRNCSRIACrAGBLRCTLSDDCIPLTWRCDGH 
PDCPDSSDELGCGTNEHiPEGDATTMGPPVTLESVTSLRNATTM 

GPPVTLESVPSVGNATSSSAGDQSGSPTAYGVIAAAAVLSASLV

TATHiLbSWLRAQERLRPLGLLVAMKESLLLSBQKTSLP 


5565 


993 


138


kwnspnparagsisrpqrapgsvsavamtaavffgcafiafgpa 
lalyvft 1 ate plri i pli agaffwlvsllisslvwfmarvi 1 d 

NKDGPTQKYLLIFGAFVSVYIQEMFRFAYYKLLKKASEGLKSIN 
PGETA PSMRLLAYVSGLG FG I MSGVFS F VNTLS DS LG PGT VG I H 
G DS PQ FFLYS A FMTLV 1 1 LLHVFWG IVFFDGCE KKKWG I LL I VL 

LTHLIiVSAQTFISSYYG INLASAFI IIiVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


s H IQHHG RG AQAP VKM VS WM I S RAWLVFGMLY P AY Y S YKAV KT ' 

KWKEYVRWPWYWIVFALYTVIETVADQTVAWFPLYYELKIAFV 

IWLLSPYTKGASLIYRKFLHPLLSSKEREIDDYIVQAKERGYET 

MVNFGRQGLNLAATAAVTAAVKSQGA1 TERLRSFSMHDLTTIQG 

DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 

EGPYSDNEMUTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 

RPQVYF 


5567 


1554 


233 


t*. t Lrfobt, Vi> VVLiANEDGUl AJjJiycX'IDDFREMVQQLLEAGANI NA 
CDSECWTPLHAAATCGHLHIiVELLIASGANriLAVNTDGNMPYDL 
CDDEQTLD CLETAMADRG X TQDS IEAARAVPELRMLDDIRSRLQ 
AG ADLHAPLDHG ATLLHVAAANG FSE AAALIiLEHRAS I>S AKDQD 
GW E P LHAAAYWGQ VPLVELL VAHGADLNAKS LMD ET PLDVCG D E 
EVRAKLLE LKH KHD ALGRAQS RQRSLLRR RTSS AGS RGKWRR V 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTGAELR 
P P P P EEDNP EWR PHNG RVGGS P VRHLYS KR LDRS VS YQLS P l»D 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETABP 
GLPGDTVTPQPDCGFRAGGDPPLLKIiTAPAVEAPVERRPCCLLM 


5568 
5569 


1731 
2 


587 
835 


AEDRQPASRRGAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL " 
SLIjVSGPR LF LLQQ PLAPSGLTL KS E ALRN WQ VYRI»VT Y I FVY E 
NP I S LLCG A 1 1 1 WR FAGNFERTVGTVRHCFFTVI FAI FS AI I FL 
S FEAVSS LS XLGEVEDARGFTP VAFAMLGVTTVRSRMRRALVFG 
MWPSVLVPWLLLGASWLIPQTSFLSNVCGLSIGLAYGLTYCYS 
I D LS ER VAL KLDQTF P FSLMRR I S VFKYVSGSS AER RAAQS RKL 
NP VPGS YP TQS CH PHLS PSH P VSQTQHASGQ KLAS WP S CT PGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIGPPTP 
VNS PGTVY SGALGTPGAAGSKESSRVPMP 

QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG
LKLL LLL LL L P LRGQ ANTGC YG I PGM F GLPGAPG KDG YDGL PG P 
KGEPGIPAIPGIRGPKGQKGEPGLPGHPGKNGPMGPPGMPGVPG 

*' wAruoruocvjRi t\v^ *vr \jct v r X V iKvJl *iUrirAPNSLiIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
VVTFCGHTSKTNQVNSGG VLLRLQVGEEVWIjAVNDYYDM VG IQG 
SDSVFSGFLLFPD 


5570


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSS PSPGKRRMDTDWKLI ESKHEVTI LGGLNE FVVKF YGPQGT 
ric.\j^vw[\VKVL»t>rL>Ai Fr KoFijJLUrrlNKJF HPNIDEASGTVCL 
DVINQTWTALYDIiTNIFESFbPQLLAYPNPIDPLKGDAAAMYLH 
RPEBYKQKIKBYIQKYATEBALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGG VATSTEEPAR PRAPQSRGPG PVSQTGRGRERGGGDT 
MSS PSPGKRRMDTDWKLI E S KH EVT I LGGLNEFWKF YGPQGT 
v l fcULJ V W K.V KVDIj PDK Y P FKS PS I G FMNKI FH PN I DEASGTVCL 
DVINQTWTALYDLTN I FES FL PQLLA Y PNP I DPLNGDAAAM YLH 

RPEBYKQKIKEYIQKYATEEAJjKEQEEGTCJDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG " 
DAGAAAEGGGWAAAAIALLTGGGEMLLNVALVALVLLGAYRLWV 
RWGREGLGAGAGAGEESPATSLPRMKKRDFSIiEQLRQYDGSRNP 
R I LLAVNGKV FDVT KGS KF YG PAG P YG I FAGRDAS RGLAT FCLD 

KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHNKOD 


5573


2562 


219 


VPARTPNAEDWPEARAATATPCQSGdRERAGEAAEDGVKMAAF 
SEMGVMPSIAQAVEEMDWLLPTDIQAESIPLILGGGDVIiMAAET 
GSG KTG AFS I P VI Q I VY ETLKDQQ EG KXGKTT I KTGAS VLN KWQ 
MNP YDRGS AFA I GS DGLC CQ S REVKEWHGCRATKGLMKGKH Y YE 
VSCHDQGLCRVGWSTMQASLDLGTDKFGFGFGGTGKKSHNKQFD 
NYGEE FTMHDT IGCYLDIDKGHVKFS KNGKDLGLAFE I PPHM KN 
QALFPACVLKNAELKFNFGEEEFKFPPKDGFVALSKAPDGYIVK 
S3HSGNAQVTQTKFLPNAPKALI VEPSR BLAEQTLNNI KQFKKY 
I DNPKLRE LL 1 1 GG VAARDQ LS VLENGVDI WGTPGRLDDLVS T 
G KLNL S Q VRFL VXD EADGLLSQGYSD F I NRMHNQ I PQ VTS DGKR 
LQVIVCSATLHSFDVKKLSEKIMHFPTWVDLKGEDSVPDTVHHV 
WP VN P KTDRLWER K3KS HI RTDDVHAKDNTR PG ANS PEM WS EA 
IKILKGEYAVRA1KEHKMDQAIIFCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 

i\uiuj,nuv t» i viw v Luiruii)\\JN x VnKivaiRVCsKAJiKMGljAISLVA 

TEKEKVX^YHVCSSRGKGCYNTRLKEDGGCTIWYNEMQLLSEIEE 
HLNCTISQVEPDI KVPVDEFDGKVTYGQKRAAGGGS YKGHVDI L 
APTVOELAAL E KE AOTS FLHI /5 YT .PNOT . v ott? 


5574 


1731 


952 


NEGLEVFKEQELQPSDKGAVPEDASTERSAMASLGLQLVGYIliG 
LLGLLGTLVAMLLPS WKTSS YVGAS I VTAVGFS KGLWMECATH S 
TGITQCDI YSTLLGLPAJDIC2AAQAMMVTSSAISSLACI ISWGM 
RCTVFCQESRAKDRVAVAGGVFFILGGLLGFIP VAWNLHG I LRD 
FYSPLVPDSMKFEIGEALYLGIISSLFSLIAG1ILCFSCSCQRN 
RSNYYDAYQAQPLATRSS PRPGQPPKVKSEFNS YSLTGYV 


5575 


456 


766 


LLWALPCP PPTAAAVLLSSTGLMELLBKMLALTLAKADS PRTAL 
LCSAWLLTASFSAQQHKGSLQKDPLLSQACVGCLEALLDYLDAR 
SPDIGRNSPHYLMFP 


5576 


249 


2146 


RSWGAPWFWPj^LURR^MPLRI^VGC^FV^FLFtLHRDVSSR 
EEATEKPWLKSLVSRKDHVXDLMLEAMNNLRDSMPKLQIRAPEA 
QQTLFSINQSCLPGPYTPAELKPFWERPPQDPNAPGADGKAFQK 
SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSLGPDTRPPECV 
DQKFRRCPPLATTSVIIVFHNEAWSTLLRTVYSVLHTTPAILLK 
E I ILV13DASTEBHLKEKLEQYVKQLQVVRVVRQEERKGLITARL 
LGASVAQAE VLT FLD AHCE CFHGWLE PLLARI ABDKTVWS P DI 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWETLPPHEKQRR 
KDBTYP I KSPTFAGGLFS I S KS YFEHIGTYDNQMB I WGGENVEM 
SFRVWQCGGQLEI I PCS WGHVFRTKS PHTFPKGTSVIARNQVR 
LAEVWMDS YKKI FYRRNLQAAKMAQEKS FGDISBRLQLRBQLHC 
HNFSWYLHNVYPEMFVPDLTPTFYGAIKNLGTN0CLDVGKNJ7RG 
GKPHMYSOIGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKGAIjG 
LGSCHFTGKNSQVPKDEEWEIAQDQLIRNSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCSCGEISVHCLPWVLFILDLKVBSSMFCPLKLILLPVLLD 

ysi^i^dlnvsppeltvhvgdsalmgcvfqstedkcifkidwtl 
spgehakdeyvlyyysnlsvpigrfqnrvhlmgdilcndgslll 
qdvqeadqgtyicbirlkgesqvfkkavvlkvhpbepkslmkv 
ggllqmgcvfqstevkhvtkvewifsgrrakeeivfryyhklrfl 
sveysqswghfqnrvnlvgdifrndgsimlqgvresdggnytcs 

I HLGNLV F KKTIVLHVS PEEPRTLVTPAALR PLVLGGNQLV 1 1 V 
G I VCAT I LLLPVL1 LI VKKTCGNKSS VNSTVLVKNT KKTN PE I K 
EKPCHFERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMKPV 
MPS LRSDRNNS LEKKSGGGMP KTQQAF 


5578


3 


783 


AVESMASPGACRAPPELPERNCGYREVEYWDQRYQGAADSAPYD
WFGDFSSFRALLEPELRPEDRILVLGCGNSALSYELFLGGFPNV
TSVDYSSWVAAMQARYAHVPQLRWETMDVRKLDFPSASFDWL 
EKGTLDALLAGERDPWTVSSEGVHTVDQVLSEVSRVLVPGGRFI 
SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LS VAQ LiALG AQ I LS P PR P PTS PC FLQDSDHED FLS AI QL 


5579 


3 


1540


RNSGLARGASAlJ\RHGGGliAGGVGWDCGACASRCQGVMEGLLTR 
CRALPALATCSRQLSGYVPCRFHHCAPRRGRRLLLSRVFQPQNL 
REDRVXjSLODKSDDLTCKSORLMLOVfltiTYPaQDnr'VHT t Dvnr 

RAM E KLVR VI DQEMQAI GGQKVNM PS LS P AEL WQ ATNRWDLMG K 
ELLRLRDRHGKS YCLG PTHEEAI TALI AS QKKLS YKQL P FLLYQ 
VTRKFRDEPRPRFGLLRGRKFYMKDMYTFDSSPEAAQQTYSLVC 
DAYCSLFNKrjGLPFVKVQADVGTIGGTVSHEFQLPVDlGEDRLA 
I CPRCSFSANMETLDLSQMNCPACQG PLTKTKGI EVGHTF YLGT 
KYS S I FNAQ FTNVCG KPTLAEMG CYG LG VTR I LAAA I E VLSTED 
CVRWPSLLAPYQACLIPPKKGSKBQAASELIGQLYDHITEAVPQ 
LHGEVLLDDRTHLTIGNRLKDANKFGYP FVI I AGKRALEDPAHF 
EVWCQNTGBVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


ADAGTRCIPGFVVPSGAGYSAPAr>RrcT?R't;qr , kM3&hniyDr'T tbh — 
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 
GPSRYVLGMQELFRGHSKTREFLAHSAKVHSVAWSCDGRRLASG 
S FDKTAS VFLLE KDRLVKENN YRGHGDSVDQLCWIIPSNPDLFVT 
ASGDKTIR I WD VRTTKCIAT VNTKGEN IN I CWS PDGQT I AVGNK 
DDWTFI DAKTHRSKAEEQ FKFEVNE I S WNNDNNMFFLTNGNGC 
INILSYPELKPVQSINAHPSNCICIKFDPMGKYFATGSADALVS 
LWDVDEL VCVR C FSRLDWPVRTLS FSHDGKMLASASEDHF IDIA 
EVETGDKLWEVQCESPTFTVAWHPKRPLLAFACDDKDGKYDSSR 
EAGTVKLFGLPM)S 


5581


<*4 " - 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSLYPTNS PS YAPEFQFLHSAYATLLMKOAWPONSSS CGT2G 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY
SPSPWPYQTAMYPIRSAYPQQWLYAQGAYYTQPVYAAQPHVIHH
TTWQPNSIPSAIYPAPVAAPRTNGVAMGMVAGTTMAMSAGTLL
TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582 


5775 


2739 


I ITNNNNVI IPLVIAYHLSGSAQARGERSPAERLMERQKRKADI ' 

EKGLQFIQSTLPLKQEEYEAFLLKLVQNLFAEGNDLFREKDYKQ 

ALVOYMEGI^VADYAASDQVALPRELLCKLHVNRAACYFTMGLY 

EKALEDSEKALGLDSESIRALFRKARALNELGRHKEAYECSSRC 

SLALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 

TAAGVADQGTSNGLGS IDDI ETDCYVDPRGSPALLPSTPTMPLF 

PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPELDTLLDSLSL 

VQGGLSGSGVPSELPQLIPVFPGGTPLLPPWGGSIPVSSPLPP 

ASFG LVTiD PS KKLAAS VLDALD ? PGP TLD P LDLL P YS E TRLDAL 

DS FGSTRGSLDKPDS FMEETNSQDHRP PSGAQKPAPS PE PCM PIT" 

TALLIKNPLAATHEFKQACQLCYPKTGPRAGDYTYREGLEHKCK 

RDI LLGRLRSSEDQTWKRIRPRPTKTS PVGSYYLCKDMINKQDC 

KYGDNCT FAYHQEE IDVWTEE R RGTLNRDLLFDPLGG VKRG S LT 

IAKLLKEHQGIFTFLCEICFDSKPRIISKGTKDSPSVCSNLAAK 

HS F YNN KCLVH I VRSTSLKYS K I RQPQEHFQFDV CRHE VR YGCL 

REDSCHFAHSPIELKVWLLQQYSGMTHEDIVQESKKYWQQMEAH 

AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGCWEPDKDLK 

YCSAKARHCWTKERRVLLVMSKAKRKMVSVRPLPSIRNFPQQYD 

LCXHAONGRKCQYVGNCSFAHSPEERDMWTFMXSNKILDMQQTY 

DMWLKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL 

CGKNSNSKKQWQQHIQSEKHKEKVFT3DSDASGWAFRFPMGEFR 

LCDRLQKQKACPDGDKCRCAHGQEELNEWLDRREVIjKQICLAKAR 

KDMLIiCPRDDDFGKYNFLLQEDGDliAGATPEAPAAAATATTGE 


5583 


3 


1265 


IKKAYRKIALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGS PS FSSPMDI FDMFFGGGGRMARERRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVIC3KCEGVGGKKGSVEKCPL 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVIREKKIIEVHVBKGMKDGQKILFHGEGDQEPELEPGDVI 
IVLDQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI 
LVI TSKAGEVI KHGDLRCVRDEGMP I YKAPLE KGIL I IQFLVI F 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265 


a o v- 1\ yva r tr\3 tw u tift*. f tf f K rtrl AI*J V axi 1 K X xUJ.u\JVKF£>ASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMD I FDMFFGGGGRMARERRGKNW 
HQhSVTLEDLYNGVTKKhAhQKNVlCEKCEGVGGKKGSVEKCPh 
CKGRGMHIHI0QIGPGMVQQIQTVCIECKGQGERINPKDRCE5C 
SGAKVIREKKIIEVHVEKGMXDGQKILFHGEGDQEPELEPGDVI 
IVlibQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI 
LVI TS KAGE VI KHGDLRCVRDEGMPI YKAPLE KG I LI IQ FLVI F 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHS LTY ATI LEMQ AMMTFDPQDI LLAGNMMKEAQ'MLCQRH RR KS 
SVTDSFSSLVKRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD 
ENMVSFI KGGI KVRNSYOTYKEUO*; ~L.Vn*ZQn'vrKn'?Xnimiwrn 

VKLG VGAFNLTLSML PTR ILRLLE FVGFSGNKDYGLLQLE EGAS 
GHSFRSVLCVMLLLCYHTFLTFVIXSTGKnmiEEAEKLLKPYLNR 
YPKGAI FLFLAGRI E VI KGNIDAAIRR FEECCEAQQHWKQ FHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITX 
AEEKLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
I SANE KK I KYDH Y L I PNALLELALLLMEQDRNEE A I KLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLBNSSRSMVSSVSL 


5586


2619


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS 
S VTDS FSSLVNR PTLGQFTEEE r HAE VCYAKCLLQRAALTFLQD 
ENMVS FIXGGIKVRN^ YnTVKFT.nQT.Vn^ QOVr'Krjp ktwdwpc" nr» 
VKLGVGAFNLTLSMLPTR I LRLLE F VG FS GN KD YG LLQ LE EGAS 
GHS FRS VLCVMLLLC YHTFLTFVLGTGNVN I EEAEKLLKP YLNR 
YPKGA I FLFLAGRI EVI KGNIDAAI RRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFODDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLELALLLMBQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFR1QAATLQAKSSLENSSRSMVSSVSL 


5587


1768


140 


SSAVPDGAVGRPVAVAVGGPPHS CRCRPCCLMAAIGVHLGCTSA " 
CVAVYKDGRAGWANDAGDRVTPAWAYSENBEIVGLAAKQSRI 
RNI SNTVMKVKQI LGRSS SD PQAQKYI AESKCI*VI EKNGKLR YE 
I DTGEETKFVK PEDVARL I FS KMKETAHSVLGSDANDVVITVPP 
DFGEKQKNALGBAARAAGFNVLRLIHEPSAALLAYGIGQDS PTG 
XSNILVFKLGGTSLSLSVMEVNSGIYRVLSTNTDDNIGGAHFTE 
TIAQYLASEFQRSFKHDVRGNARAMMKLTNSAEVAKHSLSTLGS 
ANCFLDS LYEGQDFDCNVS RARFELLCS PLFNKCI EAI RGLLDQ 
NGFTADD I NKWLCGGSS R I P K LQQ L I KDL FP AVE LLNS I P PDE 
VIPIGAAIEAGILIGKENLLVEDSLMIECSARDILVKGVDESGA 
S RFTVL F PSGT P L P ARRQHTLQ APG S I S S VCLELY ES DG KNS AK 
EETKFAQVVLQDLDKKENGLRDILAVLTMKRDGSLHVTCTDQET 
GKCEAISIEIAS 


5588 


3 


589 


TPPPPEQAMVAATVAAAWLLLWAAACAQQEQDF YDFKAVN IRGK 
LVSLEKYRGS VSLWNVAS ECG FTDQH YRALCX)UJRDU3PHHFN 
VLAFPCNQ FGQQE P DS NKE 1 E S P ARRT Y S VS F P M FS KI AVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQITALVRKLILLKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAALPGEEGDPTRGRSLGRASWESGS 
PRRPRS P FS S FIi PR PI CLS LE ARPCS 1 E DRRNW S L IGRPGA P AS 
GIjNRSSGLWLGPDRCRPR5RCSCRVMENPSPAAALGKALCALLL 
ATU3AAGQPLGGES ICS ARAPAKYS ITFTGKWSQTAFPKQYPLF 
RPPAOWSSIiLGAAKSSDYSMWRKNQYVSNGLRDFAERGEAWALM 
KEIEAAGEALQSVHAVFSAPAVPSGTGQTSAELEVQRRHSLVSF 
WRIVPSPDWFVGVDSLDLCDGDRWREQAALDLYPYDAGTDSG? 
TFSSPNFATIPQDTVTEITSSSPSHPANSFYYPRLKALPPIARV 
A uLMXtmjii fK/U' X xVAi? vh rSRDNEI VDSASVPETPIiDCEVSLW 
SSWGLCGGHCGRLGTKSRTRYVRVQPANNGSPCPELEEEAECVP 
DNCV 


5590


72 


896 


LCSSGALRLLPAMVAWRSAFLVCLAFSLATLVQRGSGDFDDF'NL 
EDAVKETS S VKQ PWDHTTTTTTNRPGTTRAPAK PPGSGLDLA DA 
LDDQDDGRRKPGIGGRERWNHVTTTTKRPVTTRAPANTLGNDFD 
IiADALDDRNDRDDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRYGSNDDPGSGMVAEPGTI AG VASAIAMALI GAVSS Y I S Y 
QQ KKFCFS I QQGLNAD YVKG ENLEAVVCE E PQVKYSTliHTQS AE 
PPPPPEPARI 


5591 


68 


1494 


AGSSRRAAABRLLVSAGCRSLAGRASGVLLLPAELLPGEEEAMA ' 

DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEPVPEPEPBPEPEPVKEEKLSPEPILVDTASPSPMETSG 

CAPAEEDLCQAFSDV ilavndvdaedgad pnl cs e yvkd iyayl 

RQLEEEQAVRPKYLIjGREVTGNMRAILIDWLVQVQMKFRLLQET 
MYMTVS I IDRFMQNNCVP KKMLQLVGVTAMFIAS KYEEMYPPB I 
GDFAFVTDNT YTKHO I ROMEMKT L.RALN FGT <G R P T. P T .WFT .T?P n <! 

KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
I LDNGE WTPTLQH YLS YTEES LL PVMQHLAKNAAMVNQG LTKHM 
TVXNKYATS KHAKI STL PQLNS ALVQDLAKAVAKV 


5592 


242 


924 


YGESKDWNQKI)LLSAIiVLTTWCLPTPIMAK5AEVKLAIFGRAG " 
VGKS ALWRFLTKRFI WE YDPTI/ES TYRHQATIDDE WSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
IKKP KNVTL TLVGNKADLDHSRQVS TEEGE JCLATELACAFYECS 
ACTGEGNI TE 1 FYELCREVRRRRMVQGKTRRRS STTHVKQAINK 
MLTKISS 


5593 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDLNHYRlGKlEGFEVLXKVKTbCIiRQNLlKCIEKIjEELQSLR 
8 LDL YDNQ I KKI ENLEALTELE I LD I SFNLLRNIEGVDXLTRLK 
KLFLVTJNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE 
S L FLG KNK I TKLQNLDALTHLTVLSMQS NRX»TKIEGI*QWI*VNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWI-INDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLAiiPS VRQ I DAT F VR F 


5594 


3 


1113 


HASGGRAAN>1AAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKIX5EERGEEDPEE5HBLPVDMBTINLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCLRQNLI KCIENLEELQSLR 
ELDLYDNQI KKI ENLB ALTELEI LDI S FNLLRNI EGVDKLTRLK 
KLFL VNNK IS K I ENLSNLHQLQML ELGSN R I RA I 2NI DTLTfJLE 
S L FLG KNK I T KLQNLDALTNLT VLSMQSN RLT KI EGLQNLVNLR 
ELYLSHNGIEVIEGLBNNNKLTMLDIASNRIKKI3NISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5595 


3 


1475 


ARWNGRWVQVPAHPGPGCGTNASGERQRQLPRAWRPVGRTLGSE 
PIALAWSPPLYLFPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LGI PTVPGKVTLQKDAQNIjIGIS IGGGAQYCPCLYI VQVFDNTP 
AALDGTVAAGDE I TGVNGRS IKGKTKVE VAKMIQEVKGE VT I HY 
NKLQADP KQGMS LDI VLKKVKHRLVENMSSGTADALGLSRAI LC 
NDGLVKRLEELERTAELYKG>1TEHTKNLLRAFYELSQTHRAFGD 
v r & v x \3 v ntir\i vAAis ISA t v aFADAH R S I EKFG I RiiLKT I KPMLT 
DLNTYLNKAIPDTRLTIKKYLDVKFEYLSYCLKVKEMDDEEYSC 
I ALGEPLYRVS TGNYE YRL r LRCRQEARARFS QMRKDVLEKMSL 
LDQKHVQDI VFQLQRL VSTMS KYYNDCYAVLRDADVFP I EVDLA 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596 


698 


219 


v»/» v jjrt f so uf/\j\riiiHH\jLt a. & yi> ij& uij&N JToR PTSE ViKISFIFP 
NGDKYDGDCTRTS SG I YERNGIG IHTTPNG I VYTGS WKDDKMNG 
FGRLEHFSGAVYEGQFKDNMFHGLGTYTFPNGAKYTGNFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597 


3


731 


ISCKMAADGQSSLPASWRSVTLTHVEYPAGDLSGHLLAYLSLSP 
VFVI VGFVTLI IFKRELHTISFLGGLALNEGVNVJLI KNVIQEPR 
PCGGPHTAVGTKYGMPSSHSQFMWFFSVYSFLFLYLRMHQTNNA 
R FLDLLW RHVLSLGL LAVAFL VSYS R VYLLYHTWS Q VLYGG I AG 
GLMAIAWPIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKLQ 


5598 


326 


2440 


G IGP I AAS FI FCKVASLYI FLS PPP PSVSGVPYS PANSS WS CAX, 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDS PGLGRLENKI FS VTNNTECGKLLEE 
IKCALCSPHSQSLFHSPBREVLERDLVLPLLCKDYCKEFFYTCR 
GHI PG FLQTTADE FC F YYARKDGGLC FPD FPRKQ VRG P ASNYLD 

LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERVIAIGPHDHILRWEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLY I ILGDGM 
ITLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYS I PRSNPH FNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSBRL 
YGS YVFGDRNGNFLTLQQS PVTKQWQE KPLCLGTSGS CRGYFSG 
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326 


2440


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVP PHPPAPS PCCSGQTMLKMLS FKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRKMSQLELLSGG 
EMLCGGF YPRLSCCLRSDS PGLGRLENKI FSVTNNTECGKLLEE 
I KCALCS PHSQS LFHS PERB VLERDLVLPLLCKDYCKEFFYTCR 
GHIP3FLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRVVEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI ILGDGM 
I TLDDMEEMDGLS DFTGS VLR LD VDTDMCWVP YS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
siiK x Ufj± x ax* ajj x fib a P S UliB r KP F SNG PLVGG FVYRGCQSERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDELGEVYIIiSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRAT VQPAQTLTS ECSRbCRNGYGTPTGKCCCS PGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKNECPFYHPKHCRFNTQCTRPDC 
TF YHPTI NVP PRHAIiKWI RPQTSE 


5601 


1977


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDM CFEGMKPVNQTAASNKGLRGLLiHPQQLHLLSRQLED PNGS F 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE


5602 


246 


766 


YHTSCTVWRTAKEALENTEVPVGCLMVYNNEWGKGRNEVNQTK 
NATRHAEMVAIDQVLDWCRQSGKSPSEVFEHTVbYVTVEPCIMC 
AAALRLMKIPLWYGCONERFGGCGSVLNIASADLPNTGRPFQC 
IPGYRAEEAVEMLKTFYKQENPNAPKSKVRKKECQQILNMF 


5603 


1 


565


FRGRTPISGGERGCAQYPIPATPARSGENRTMPGAGDGGKAPAR
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS
C FG FBDLHFRWTYNS SDAFK I L I EG T VKNE KSDP KVTLKDDD R I 
TLVGS T KEKRNN 1 S I VLRDLE FSDTGKYTCRVKNP KENNLQHHA 
T I FLQ WDRRMQ 


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQEPPVRDHRSWGGSGAGGVAGREWT 
DQGQVALGGHYMAEGEGYFAMS EDELACS p YI PLGG DFGGGD FG 
GGDFGGGDFGGGDFGGGGS FGGHCLD YCES PTAHCNVLNWEQVQ 
RLDGILSETIPIHGRGNFPTLELQPSLIVKWRRRLAEKRIGVR 
d VRLNGS AASHVLHQDSGLGYKDLDL I FCADLRGEGEFQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDSFQIKLDSLLLFYECSE 
NPMTETFHPTI IGES VYGDFQEAFDHLCNKI I ATRNPEEIRGGG 
LLKYCNLLVRGFRPASDEIKTliQRYMCSRFFIDFSDIGEQQRKL 
E S YLQNH FVG LEDRK YE YLMTLHG WNEST VC LMGH E RRQTLNL 
ITMLAIRVLADQNVI PNVANVTCYYQPAPYVADANFSNYYIAQV 
QPVFTCQQQTYSTWLPCN 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAPVRLGRKRPIjPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRS LRR Y PLPLRSGKEAK I LQH FGDGLCRMLDERLQRERTS G 
GDHAPDS PSGENS PAPQGRLAEVQDSSMP VPAQPKAGGSGS YWP 
ARHSGAR VI LLVLYREHLNPNGHHFLTKEELLQRCAQKS PRVAP 
GSARPWPALRSLU IRNLVLRTHQPARYS LTPEGLELAQKLAESE 
GLS LLNVG IGPKE P PGEETAVPGAAS AELASEAGVQQQPLELRP 
GEYRVLLCVDIGETRGGGHRPELLREIjQRtjHVTHTVRKLHVGDF 
VWV AQETN P RDP ANPGE LVLDHI VE RKR LDDLCS S 1 1 DGRFREQ 
K FRL XRCGLERR VYL VE EHGS VHNLS LP E STLLQAVTNTQ VI DG 
FFVKRTADIKESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 
SG AMTS PN P LCS LLT FS DFNAGA I KN KAQS VRE VFARQLMQ VRG 
VSGEKAAALVDR YS TPAS LLAAYDACATP KEQETLLSTI KCGRL 
QRNLG PALS RTLS QL YCS YG PLT 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LSSVGSISEEETCEKLKGLIQROVQMCKRNLEVMDSVRRGAQLA 
I E ECQ YQFRNR RWNCS TLDSLPVFGK WTQGTR EAAFVYA I S S A 
GVAF AVTRACS SGELE KCGCDRT VHGVSPQG FQWSGCSDN I AYG 
VAFSQSFVDVRERSKGASSSRALMNLHNNEAGRKAILTHMRVEC 
KCHGVSGSCEVKTCWRAVPPFRQVGHALKEKFDGATEVEPRRVG 
SS RALVPRNAQ FKPHTDE DLVYLE PS P DFCBQDMRSG VLGT RGR 
TCNKTS KAIDGCEHjCCGRGFHTAQVELAERCS CKFHWCCFVKC 
RQCQRLVELHTCR 


5607


521 


141 


PPVCNPAKAMPSPGTVCSLLLLC^MI.WbDLAMAGSSFLSPEhQRV 
QQRKESKK P PAFCLQPRAIiAGWLRPBDGGQAEGAEDELEVRFNAP 
FDVGIKLSGVQYQQHSQALGKFLQDILWEEAKEAPADK 


5608


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCX3DAVGYPLDTVKV - 

RIQTEPKYTGIWHCVRDTYHRBRVWGFYRGLLLPVCTVSLVSSE 

VFGTYRHCLAHICRLRFGNPDAKPTKADITLSGCASGLVRVFLT 

SPTEVAKVRLQTQTOAQKQQRRLSASGPLAVPPMCPVPPACPBP 

KYRGPLHCLATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA 

VLCEWLSPAGHSRPDVPGVLVAGGCAGVLAWAVATPMDVIKSRL 

QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAFPVN 

MWFVAYEAVLRLARGLLT 


5609 


1628 


304 


AKG VW VL PS P P PR PGRG AL VSGSGLRRGRS GTS WR P R RMNH KS K 
KRIR E AKRS AR PELKDS LDWTRHNY YESFSLS PAAVADNVERAD 
ALQLSVEEFVERYERPYKPWLLNAQEGWSAQEKWTLERLKRKY 
RNQKFKCGEDNDGYSVKMKMKYYIEYMESTRDDSPLYIFDSSYG 
EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTG I H I DP LGTS AWNALVQGHKR WCLF PTST PR ELI KVT R DEGG 
NQQDEAITWFNVIYPRTQLPTWPPBFKPLEILQKPGETVFVPGG 
WWHVVLNLDTTIAITQNFASSTNFPVVWHKTVRGRPKLSRKWYR 
ILKQEHPELAVLADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SE CE S GS EGDGTVHRRKKR RTCS MVGNGDTTS QDDCVS K E R SSS 
R 


5610 


54 


1196 


LERTPASADI^WTKYQLFLAGLMLVTGSINTLSAKWADNFMAEG ' 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPQQ P FNPLLFLPPALCDMTGTS LMYVALNMTS ASS FQMLRGA 
VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVWGLADLLSKH 
DSQHKLSEVITGDLLI IMAQI IVAIQMVLEEKFVYKHNVHPLRA 
VGTEGLFGFVILSLLLVPMYYIPAGSFSGNPRGTLEDALDAFCQ 
VGQQPLIAVALLGNISSIAFFNFAGISVTKELSATTRMVLDSLR 
TWIWALSLAI^WEAFHALQILGFLILLIGTALYNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTPINDAS 


5611 


2 


577 


FVLPNRLGIPGSTFRGPGACASSSSLAASAKPGAGGSPALAMSG 
ELSNRFQGGKAFGLLKARQERRLAEINREFLCDQKYSDEENLPE 
KLTAFKEKYME FDLNNEGE I DLMSLKRMMEKLGVPKTHLEMKKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVLKLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPEMPQYGEE^HIFELMQAMWLCKHLNS 
S LLT LENLILNEFS YTATEARRLYLQRKT VPS ALLVQLIQERLA 
EEDCIKQGWILDGIP ETR EQ ALRIQTLG I T PRHV I VLS A PDTVL 
IERNLGKRIDPQTGEIYHTTFDWPPESEIQNRLMVPEDISELET 
AQKLLEYHRNIVRVIPSYPKILKVISADQPCVDVFYQALTYVQS 
NHRTNAPFTPRVLLLGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMLPLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRNLFFFLCLNLSFAFVELLYGIWSNCLGLISDSFHMFFDST 
AI LAGLAAS VISKWRDNDAFS YGYVRAE VLAGFVNGLFLI PTAF 
FI FSEGVERALAPPDVHHERLLLVSILGFWNLIG I FVFKHGGH 
GHSHG S GHGHSHS L FNGALDQAHGHVDHCHS H EVKHG AAHSHDH 
AHGHGHFHS HDG PS LKETTG PS RQ I LOG VFLHI LADTLGS I G VI 
ASAIKMQNFGLMIADPICSILIAILIWSVI PLLRESVGILMQR 
TP PLL ENS LPQC YQ R VQQLQG V YS LQ EQH F WT LCS D VYVGTLKL 
I VAPDADARWI LSQTHNI FTQAG VRQLYVQ I DFAAM 


5614 


3 


1268 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER"" 
APWGARQRLGVMAELQQLQEFEIPTGREALRGNHSALLRVADYC 
EDNYVOATDKRKALEETMAFTTQALASVAYQVGNLAGHTLRMLI) 
IiQGAALRQVEARVSTIiGO^TVNMHMEKVARREIGTLATVQRLPPG 
QKVI A PENLP PLTP YCRR PLNFGCLDDI GHG I KDLS TQLS RTGT 
LSRKSI KAPATPASATLGRP PR I PEPVHLPWPDGRLSAASS AS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
3PDE PSWVPAS YLE KWTLY P YTS QKDNE LS FS EGT V I C VTRRY 
SDGWCEGVSSEGTGFFPGNYVEPSC " — 


5615 


9 


1558 


AbGRRRPGDPREMEAAATPAAAGAAKREELDMDVMRPLINEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGISFVQTLMKIiLKGNIGTG 
liiAjijFi^iiuvAUivijGPiSLVFIGIISVHCl'IHILVRCSHFLCLR 
FKKSTLGYSDTVSFAM2VSPWSCLQKQAAWGRSWDFFLVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LR I YMLC FL P F 1 1 LL V FI R EL KNL FVLS FLANVSMAVS L VT I YQ 
YWRNKPDPHNLPIVAGWKKYPLFFGTAVFAFEGIGWLPLENQ 
MKESKRFPQALNIGMGIVTTLYVTLATLGYMCFHDEIKGSITLN 
1* FUU vw jj yg S VK I L YS FG I FVTY S I QF YV PAE 1 1 1 PG I T S KFHT 
KWKQICEFGIRSFLVSITCAGAILIPRLDIVISFVGAVSSSTLA 
LILPPLVEILTFSKEHYNIWMVLKNISIAFTGWGFLLGTYITV 
EE 1 1 YPTP K WAGTPQS P FLNLNSTCLTSGLK 


5616 


1 


719 


DDFVRCGPQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHLSSGDLLRDNMLRGTEIG VLAKAFI DQGKLI PDDVMTRLAL 
HELKNLTOYSWLLDGFPRTLPQAEALDRAYQIDTVINLNVPFEV 
IKQRDTARWIHPASGRVYNIEFNPPKTVGIDDLTGEPLIQREDD 
KPETVIKRLKAYEDQTKPVLEYYQKKGVLETFSGTETNKIWPYV 
YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


P WRGRGSR PRG AGAMAE EQ VNRSAGLAPDCEASATAE TT VS SVG 
TCE AAG KS P E P XDYDSTCVFCR I AGRQ DPGT ELLHCENEDL I C F 
KD I KPAATHHYLWPKKH I GNCRTLRKDQVE LVENMVTVGKTI L 
ERNNFTDFTNVRMGFHMPPFCSISHLHLHVLAPVDQLGFLSKLV 
YR VNS YWF I TADH LIE PCLRT 


5618 



3 


1692 


YIJ^YINLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKSIRLLSEIEKLVGTSVPGLLEIILSSSILEIYN 
H I LQTWPDEDVT PR KS CATKP KLSN I NQE EAS GTSLHQKA I MT 
FTCHNEINAFWLSRGSQILSLNSTRFLTKLGHCSSACPSDSVS 
QTN I QNLKGLNS P VLI G KS KDPS CV AKVS EEG K P AI GTQKME LH 
VRWRSDTGKCVDASPLWIPTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVKW2QI LGDR I ESSACVS KCGNF I WGC YNGLVYVLKSNSG 
EKYWMFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKXC 
VWKSKCGGTVFSS PCLNLI PHHLYFATLGGLLLAVNPATGNVIW 
KHSCGKPLFSSPQCCSQYICIGCVDGNLLCFTHFGEQVWQFSTS 
GPIFSSPCTSPSEQKIFFCSHDCFIYCCNMKGHLQWKFETTSRV 
YATPFAFHNYNGSNEMLLAAASTDGKVW I LESQSGQLQSVYELP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSLGSPPGAGRGCPCP ' 

AQSLHSHQLAAWDPLKPSLRSYPPHLLQHPQLRSLTASSGHLGR 

RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 

LWGTKGRGSGSPSSPGCCLHPPAQHSQDLPLVHVDVGWQPPLGP . 

TVGLRPGLLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAIEAIKLGST • 
AIGIQTSEGVCLAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 
LI ADAKTL I DKARVETQNHWFTYNETMTVESVTQAVSNLALQFG 
c cl/mu i'ljiiMiK V r L» VAJjJj FGG VDEKG PQLFHMDPSGTFVQCDAR 
AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVMEEKLNA 
TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 
5622


3 


819 


VVEF VE YTATDAN VKNES LS S VQQLG I KMTVR YG KFLS LLKDGA ~~ 

ENDLTWVLKHCER FLKQQQTS I KSSLLCLQGNYAGHDW FVSSLF 

MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 

GIHPVYFCSTHYIEMLLKAELPLVFSAFHMSGFAPSQICLQWIT 

QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 

HTQTQDLQVFLKEEALHGFRVS D YFE YMEI LEQN YRTVLLRDMR 

NIRLQST 




1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRLNMNPQIRNPMKAMY 
PGTFYFQFKNLWEANDRNETWLCFTVEGIKRRSWSWKTGVFRN 
QVDSETHCHAERCFLSWFCDDILSPNTKYQVTWYTSWSPCPDCA 
GEVAEFLARHSNVNLTI FTARLY YFQYPCYQEGLRS LSQEGVAV 
ISI^YBDFKYCWENFVYNDNEPFKPNKGLKTNFRLLKRRLRg^IT" 
Q 


5623 


3 


954 


FLPFFIRAPK1SRNGQWLFTFTTPFPFANKALPGWEGIVPACFW" 
RKKILTPSTGTMHLLQVTILFLLPSICSSNSTCVLEAANNSLW 
TTTKPSITTPNTESLQramrrPTTGTTPKGTlTNELLKMSLMST 
ATFLTSKDEGLKATTTDVRKNDS I ISNVTVTSVTLPNAVSTLQS 
SKPKTETQSS I KTTEI PGS VLQPDAS PSKTGTLTSI PVTI PENT 
SQSQVIGTEGGKNASTSATSRSYSSIILPWIALIVITLSVFVL 

VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 


5624 


159 


898


PGVAAAAGALPQYHGPAPALVSCRRELSLSAGSLQLERKRRDFT 
SS G S RKLY FDTHALVCL LEDNGFATQQAE 1 1 VS AL VKI LEANMD 
IVYKDMVTKMCKJEITF0QVMSQIANVKKDM1ILEKSEFSALRAE 
NEKI KLELHQLKQQVMDE VI KVRTDTKLDFNLEKSRVKELYS LN 
EKKLLELRTE I VALHAQQDRALTQTDR KIETEVAGLKTMLES HK 
LDN IKY LAGS I FTCLTVALG FYRLW I 


5625 


1 


1180 


TI PS S AAACRAG P PAGALEALS PGGARAHAERRG EMRATPLAAP ' 
AGSLSRXKRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATAS R LGP YVLLE PE EGGRAYQALHC PTGTE YTCR 
VYPVQEALAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHSLVRSRHRIPEPEAAVLFRQMATALAHCHQHGLVLRDLKL 
CR FVFADRERKKLVLENLEDS CVLTGPDDSLWDKHAC PAYVGPE 
ILSSRAS YSGKAADVWSIiGVALFTMLAGH Y PFQDS E PVLLFGKI 
RRGA YALPAGLS APARCLVRCLLRREPAERLTATG I LLH PW LRQ 
DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626
5627


3123 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAISI 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTORQVN 
ITVQ K KVS Q WWBRLTKQEKRP L FLAPD FDRWLDESDAEME LRAK 
EEERLNKLRLESEGS PETLTNLRKGYLFMYNLVQFLGFS W I F VN 
LTVRFC I LGKESFYDTFHTVADMMYFCQMIiAWETl NAAIGVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
IFRYSFYMLTCIDMDWKVLTWLRYTLWIPLYPLGCLAEAVSVIQ 
S I PIFNETGRFS FTLPYPVKI KVRFSFFLQI YLIMI FLGLYINF 
RHLYKQRRRRYGQKKKKIH 




3123 


2011 


PPRALGSVAMEKQVLTPHVYWAQRHRELYLRVELSDVQNPAI S I 
TENVLHFKAQGHGAKGDNVYEFHLEFIiDIiVKPEPVYKLTQRQVN 
I TVQK KVS Q WWERLTKQ EKRPLFLAPDFDR WLDESDAEM ELRA K 
EEERLNKLRLESEGS PETLTNLRKGYLFMYNLVQFLGFSWI F VN 
LTVRFCILGKESFY0TFHTVADMMYFCQMLAWETINAAIGVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYSF YMLTCIDMDWKVLTWLR YTLWI PLYPLGCLAEAVSV1Q 
S I PIFNETGRFSFTLPYP VKI KVRFSFFLQI YLIMI FLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5628


75 


1455 


VAGAMASKCLKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP 
SLSPVARS FSACSVGLGRSSYRATSCLPALCLPAGGFATSYSGG 
GGWPGEGILTGNEKETMQSLNDRLAGYLEKVRQLEQENASLESR 
IREWCEQQVP YMCPDYQS YFRTIEELQKKTLCS KAENARLWE I 
DNAKLAADDFRTKYETEVSLRQLVESDINGLRRILDDLTLCKSD 
LEAQ VES LKE E LLCLKKNHE E EVNS LRCQLGDRLNVE VD AAPP V 
DLNRVLEEMRCQYETLVENNRRDAEDWLDTQSEELNQQWSSSE 
wuvov.uftoi x aiiiut I vw/UjoIiSIiCAQHSMRDALESTLAETEARY 
SSQLAQKQCM I TNVEAQLAEIRADLERQNQEYQVLLDVRARLEC . 
EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCS AR P I CVPC PGGR F 


5629 


2287 


938 


GRPRS S SDN RNFLRERAGLS S AA VQTR IGNS AAS RRS PAAR P P V 
PAP PALPRGRPGTEGSTS LS APAVLWAVAWVWVS AVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTLQLFTDG I TNKLIGCY VGNTMEDWLVRI YGNKTELLVDR 
DBEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAIFRLIARQLAXIHAIHAHNGWIPKSNLWLKMGKYFSLIPTGF 
AUh'UINKHFl^DIPSSQIUJEEMTWHKEILSNLGSPVVLCHNDr" 
LCKNIIYNEKQGDVQFIDYEYSGYNYLAYDIGNHFNEFAGVSDV 

-"V^ V' uivf x i uzirt i iVd r Mif V3 ibVl IIIvgVEILFIOV 

NQFALASHFFWGiiWALIQAKYSTIEFDFIiOYAIVRFNQYFKMKP 
HVTAIiKVPE 


5630 


1194 


278 


G FWA I AQTCAHHL P PGS P W LVPAS PWRLPEMSSFG YRTLT VAL F 
TL I CCPGS DEKVFEVHVRP KKLAVE P KGS LE VNCS TTCNQ P EVG 
GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 

NSNVSVY0PPROVII.TT.OPTT UaurVCCTTT3rinwnvtir.nT p.,,- „, 

LFI/FRG?JETLHYETFGKAAPAPQEATATFNSTADREDGHRNFSC 
LAVLDLMSRGGNI FHKHS AP KML E I YE P VS DSQMV I IVTWSVL 
LS L FVTS VLLC FI FGQH LRQQRMGTYG VRAAWR RL PQA FR P 


5631 
5632 


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARKAATMSVFGKLFGAGGGK 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAXKHG 
TKNKRAALQALKRKKRYEKQIjAQ IDGTLSTI EFQREALBNANTN 
TEVLKNMGYAAKAMKAAHDNMDIDKVDELMQDIADQQELAEEIS 
j. **. j. o r*.*- v v? c ijis £, t uwjfc JjMAfc. JL»fc B LEQE E LDKNLLEISGPETVP 

LPNVPS I ALPSKPAKKKEEEDDDMKELENWAGSM 


5633 


3 


952 


WLGWSPPRRLWWGSLGAAQRPAVPVSGLARSLHVETRRPHRRA 
o v k v/\kv?kjjU V w AQ PQP IJj P R P vGS RRE MQP PG P P PAYAPTNGD 
FTFVSSADAEDLSGSIASPDVKLNLGGDFIKESTATTFLRQRGY 
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYK1RCVLMPMPSLGF 
NRQWRDNPDFWGPLAWLFFSMISLYGQFRWSWIITIWIFGS 
LTIFLLARVLGGEVAYGQVLGVIGYSLLPblVIAPVIiLVVGSFE 

v »tp *uj.i\ucvavr nrtrtiaMrtcaLiijVolSiSr A.I KKPLLX YPIFLLYI Y 

FLSLYTGV 




771 


460 


qgcsktmsvgrpfyrssefmeqllsshlhqvpffccftwcl.cn 

CLFBNSVSKLYMLCFNFFMSIFFYSLSITKLNLIYLWGLSYQSL 
LLLLLSGHRPWGSSMV 


5634 
5635 


1446 


855 


pratgrirsraaasrpragagasgaeprsgrersrlsgrrapam " 

arntlssrfrrvdidefdenkfvdeqeeaaaaaaepgpdpsevd 

gllrqgdmlrafk^lrnspvntkiniqavkeraqgvvlkvltnfk 

" v vc>iji^KiiL> vjji^ijroA 1 1 iKC^rEKPTENSSAVLLQWHEK 

alavgglgs iirvltarktv 




3 


943


DRGPRST ATDTGRARVS FWRFPLDPGVKNSNVQISGEKRRFRTL"" 

rslfhpfpvtrsgapravlvgsswpakmvapavkvargwsglal 
gvrravlqlpgltqvrwsryspefkdplidkeyyrkpveeltee 
ekyvrelkktqlikaapagktssvf^dpviskftnmmmiggnkv 
larslmiqtleavkrkqfekyhaasaeeqatiernpytifhoal 
kncepmiglvpilkggrfyqvpvplpdrrrrflamkwmitecrd 
kkhqrtlmpeki^hklleafhnqgpvikrkhdlhkmaeanrala 

HYRWW 


5636 


2253 


1143 


ledticqhppaekklylyhrklreverKgiprlpkdvfmdthqg 
ltdvrakvtgfsegwdsvkggfssfsqathsaagawskprei 
aslirnkfgsadnipnlkdsleegqvddagkalgvisnfqsspk 
ygseedcssatsgsvgansttggiavgasssktntldmqssgfd 
allheiqeiretqarleesfetlkehyqrdyslimqtlqeeryr 
cerleeqlndltelhqneilnlkqelasmeekiayqsyerardi 
qealeacqtriskmelqqqqqqwqleglenatarnllgklini 

liavmavllvfvstvancvvplmktrnrtfstlflvvfiaflwk 
hwdalfsyverffsspr 


5637 


946 


2532


msfcgaranaxmmaaynggtsaaaaghhmhhhhhlphlppphlTT" 

hhhhpqhhlhpgsaaavhpvqqhtssaaaaaaaaaaaaamlnpg 

qqqpyfpspapgqapgpaaaapaqvqaaaaatvkahhhqhshhp 

qqqldiepdrpigygafgwwsvtdprdgkrvalkkmpnvfqnl 

vsckrvfrelkmlcffkhdnvlsaldi lqpphi dyfee I YWTE 

lmqsdlhkiivspqplssdhvkvflyqilrglkylhsagilhrd 

I KPGNLLVNSNCVLKICD FGLARVEELDESRHmtQEVVTQYYRA 
PEZLMGSRHYSWAIDIWSVGCIFAELLGRRiLFQAQSPlQQLDL 

itdllgtpsleamrtacegakahilrgphkqpslpvlytlssqa 
THEAWLLCRMLVFDPYKRISAKDALAHPYLDEGRLRYHTCMCK 

s '* ** * i *. our firV l«rM UU X I lilvFitjooVKQ VKE I IHQF 

1 LEQQ KGNRVPLC INPQSAAFKS FI SSTVAQ PSEMPPS PLVWB 


5638


125 


1155 


DRKMS ELDQLRQEAEQLXNQIRDARKACADATLSQITNN I D PVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR 
EGNVRVSREIAGHTCYLSCCRFLDDNQIVTSSGDTTCALWDIET 
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
r. 1 1 &nu.Nii lu ITSVS FSKSGRLLLAGYT3DFNCNVWDALKADRA 
G VLAGHDNRVS CLGVTDDGMAVATGS WDS FLKI WN 


5639 


125


1155 


DR KMS E LDQLRQ EAEQLKNQI RDAR KAC ADATLS Q I TNN I D P VG 
RIG^RTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TWKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR 
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET 
GQQTTTFTGHTGDVMSLSLAPDTRliFVSGACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
MTYSHDNIICGITSVSFSKSGRLLLAGYDDFNCNVWDALKADRA 
GVLAGHDNRVS CLG VTDDGr4AVATGS WDS FLKIWN 


5640 



280


1092 


yUOW KK A MLtSHN 1 MMKORKyyATAi VHGNDVDGMDLGKKVS 
IPRDIMLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELL8ALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 
LTD PR FMSFVNPLSGRRS FNRTPKGW I SENI PI VITTEPTDDTT 
VPESEDL 


5641


27 


332 


CRHNCNGDVKLLSNQMDKLFAFHLFTFHGLLHFLDGSIQKLIQA 
KJ.iijbUNo3XLVLENNFLFKvECSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247


ITPCRMDFLVLFLFYLASVLMGLVLICVCSKTHSLKGLARGGAQ 
I FS C 1 1 P ECLQRAMHGLLH YL FHTRNHT FI VLHLVLQGM V Y T E Y 
TWE VFG YCQEL ELS LHYLLL P YLLLG VNLFFFTLTCGTN PG 1 1 T 
KAKBLLFLHVYEFDEVMFPKNVRCSTCDLRKPARSKHCSVCNWC 
VHRFnHHCVWVNNCIGAWNIRYFLIYVLTLTASAATVAIVSTTF 
Ajvnijv vrjauijiytx i IULjLCiHLHVMDTvFLIQYLFLTFPRIVFM 
LGFVWLS FLLGG YLLFV LYLAATNQTTNE WYRGDWAWCQRC P L 
VAWPPSAEPQVHRNIHSHGLRSNLQEIFLPAFPCHERKKQE 


5643 


1 


847 


PSGG VRD VETRG PG S RAARG PR VVMKRRG VG AG AI AKKKLAEA K 
YKERGTVLAEDQLAQMSKQLDMFKTNLEEFASiCHKQEIRKNPEF 
RVQFQDMCATIGVDPLASGKGFWSEMLGVGDFYYBLGVQIIEVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDLIRAIKKLK 
ALGTGFGriPVGGTYLIQSVPAELNMDHTWLQLAEKNGYVTVS 
EIKASLKWETERAROVLEHT,T.lTPf!TiawT rm^nnr>z?Ktrvxjr o*t t? 

M<vn a a on/my v UjQHJjjj|\JHjJjfiW laUJj\jAir\j e» AM Y W 1 1 PAT i T< 

TDLYSQE I TAEEAREALP 


5644 


83 


1138


* * >4,J ,wt "' ¥ v u * vvav^yivnjrun jl vftvjy/ yiSKAKfc TEEVIEYFQ 
KKVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP 
YVAI ED KDMQQKEQQ FREW FL K E FPQ I RW K I QES I ERLR VI ANE 
IEKVHRGCVIANWSGSTGILSVIGVMLAPFTAGLSLSITAAGV 

GLGIASATAGIASSIVENTYTR^AFLTAQDT.TaTCTnriT IT7\t nn 

I LHDIT PNVLS FALDFDEATKM I ANDVHTLRRS KATVGRPLIAW 
R YVP I NWET LRTRGAPTR I VR KVARNLG KATSG VLWLDWNL 
VQDSLDLHKGEKSESAELLRQWAQELEENLNELTHIHQSLKAG 


5645 


537 


799 


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASSQVPTL 
YLCI^NSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646


3745 


3328 


AEQYGTSPHLLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS IPATKRASFLSS FI KMFFEELE YILGF 
LSLLKFHVHVSVYSAICHFQKEGTGNSRSFTCTPELFPRLQTHL 
RAEGGAQ 


5647 


288 


800 


GVI MATS ELS CEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 
EEDTQRHETYHQQGQCQVLVQRSPWLMMRMGILGRGLOEYQLPY 



5649 



UK VbPL P I Fl'fAKMGATKEEREDT P IQLQE L LALET AI/3GQCVD" 

ROEVAETTVriT.DDtn/mrevrii^AT nnor nnr... n » n . ^ 



1518 



1172 



5650 



1172 



3006 



3006 



5651 



646 



5652 
5653



735



~6T 



"34T 



1401 



RQEVAEITKQLPPWPV5KPGALRRSLSRSMSQ EAQRG 
VLSSLCGRHBALREVGAEWPPPTCSPNICSGLQQAGNTDWSLTM 
I APQSLPSSRMAPLGMLIX3LLMAACFTFCLSHQNLKEFALTNPEK 
SSTXETERKETKAEEELDAEVLEVFHPTHEHQALQPGQAVPAGS 
HVRLOTXJTGEREAKLQYKDKFRNNLKGKRLDINTNTYTSQDLKS 
ALAKFKEGAEMESSKEDKARQAEVKRLFRPIEELKKDFDELNW 
IETDMQIMVRLINKFNSSSSSLEEKIAALFDLEYYVHQMDNAQD 
LLS FGGLQ WINGLNSTE PLVKEYAAFVLGAAFSSNP KVQVEAI 
BGGALQKLLVILATEQPLTAKKKVLFALCSltLRHFPYAQRQFLK 
LGGLQVLRTLVQEKGTEVLAVRWTLLYDLVTEKMFAEEEAELT 
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREXV 
LQTLGVLLTTCRDRYRQDPQLGRTLASLQAEYQVIASLELQDGE 
DEGYFQELLGSVNSLLKELR 

KLQEQLDA1NEEIRMIQEEKESTELRAEE1 ISTRVTSGSMEALNL 
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR 
MG VMTLPSDLRKHRR KLLS PVSREENR EDKAT I KCE TS P P S S PR 
TLRLE KLGH PAbSQE EGKS ALEDQGS NPS S S NS S CJDS LH KG A KR 
KC-IKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTVVSWL 
ELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR 
LKIiRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV 
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE 
LEKRR EESQHE I KD VLVWTNDQWHWVQS I GLRD YAGNLH E S G V 
HGALLAIjDENFDHNTLALILQIPTQNTQARQVMEREFKNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPE AHSHYLYGHMLSAFRD 
MLQEQLDAINEEI RM IQEBKES TELRAEEI ETRVTSGSMEALNL 
KQLRKRGS1PTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLS P VS REENREDKATI KCETS PPSS PR 
TLRLEIOGHPALSQEEGKSALEDQGSNPSSSNSSODSLHKGAKR 
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL 
ELWGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR 
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV 
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE 
LEKRREESQHEIKDVLVWTNDQWHWVQSIGLRDYAGNLHESGV 
HGALLALDENFDHNTLALILQI PTQNTQARQVMERE FNNLJLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
QFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGH MLSAFRD 
ARQGQRQPWG^EARAKGPASESPRV^EGSGWEGPASP+TPGSTL" 



AWG EG AG I R * ASGLTAAG AAS AAAA/ PP P TRGG P APAG CGRA P P 
WPAPLRVPTHGRAPAPRS RAAPRAPALSHGTAAAALS PAS PAGP 
ADP * L PGHSSQS P PRG * RWGR S RS A PAPAH PEH PA PAGSAS ASQ 
QTPG WPGSCCL AQGWQ AE PLGA PGAE DG \ P VP P QRG FP LGTLGS 
PAGSWAGLAGYG*AGAPGTQATAPRAAGQTPVAAAPNCRV+GSA 
PALHRAPAAADPGSPLQAPPRAWASPAAAGPGLSSSDYCGGLGA 
GWRAGISPEIiIjGAAGLSDNWARCPGPG pae *ggqpgcrti pasa 

CMPSPPVEGSLGLSRKGHGDLPSQAR*GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS 

HHKKYQHIHQKSFSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI 
CE FCARS FRTS S N LV IHRR I HTGE KPLQCE I CG FTCROKAS LNW 
HQRKHAETVAALRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA 



RGRLQSRGRLTLGLVLLLLDltGARQHGQRVSHGWKGGFLTAPL 
| CFPQPCQPGTRRGRRRSLKEATEPQLAMAEEFVTLKDVGMDFTL 
GDWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
LEPItAGGSPEATSPDVTETKNSPLMEDFFEEGFSQEl/SRDVIQ 
1 GWLLELQFRRSLYRGHLVR»FARRSRKSSEV*YCHQRGKSHGMQ 
ES* ikertqscvhrfhgrrfhg\dnvsektltpAkskeyrgsfp 
s ysdhsqqds vqege kp yqcs e cgxs fsgs yrltqhw i thtre k 
ptvhqeceqgfdrkashsgypkthtgykfyvcweygtpfsqsty 
lwhqicthagekpcksqdsdhppshdtqsgehqkthtdsksyncn 
ecgkaftrifhltrhqkihtrkryecskcqatfnlrkhliqhqk 

THAANV 


5654 


3 


598 


tlplfpgrrfrgwrrcgavaarknstggnvsinqrrdsvrmsal 

NWKPFVYGGLAS I TAECGTFPI DLTKTR FQ IOGOTNDAKFKBI I 
YRGMLHALVRIGREEGLKALYSG*VGLHAFLCHCSLFHMGIDFR 
PRLHRSQVKSLRCV* KEQIA* + /MFSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDHNKKICLFKNI 


5655 


2 


867 


RPPGI RAPRQLHPAAGRR PDASARPRFRPT VLLHDP FQLSFPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 

atdemipfkdegdpqXrekifaeivnpeeegdladiksslvnes 

EIIPASNGHEVARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 
PSYSSYSGYIMMPNMNNDPYMSNGSLSPPIPRTSNKVPWQPSH 
AVHPLTPLITYSDEHFSPGSHPSHIPSDVNSKQGMSRHPPAPDI 
PTFYPLSPGGGGQITPPLGWQGQP 


5656


228 


1066 


PRRVPPLPEFASGPGAAFFHSGRLQRSLiTKDSAGCFSQCRSRAM " 

IjVLRSGIiTKALASRTIiAPQ VCSS FATG PRO YDG TF YEFRT YYLK 

PSNMNAFMENLKKNIHLRTSYSELVGFWSVEFGGRTNKVFHIWK 

YDNFPHRAEVRKALANCKEWQEQSIIPMLARIDKQETEITYLIP 

WSKLQKPPKEGVYELAVFQMKPGGPALWGDAFERAINAHVNLGY 

TKWGVFHTEYGELNRVHVLWWNESADSRAAVRHKSHEDPISWG 

GVRE S VNYL \ VS QQNM 


5657


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAABMNGEEEESEEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEKL 
FRERLSQLRIiRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECELQGAKQHLESEKLLLYDTLQGELQERIQR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEID2LEDMTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQ S S WH CTQDSR LPP ADRRTHRP LR VCPARLL WCCW AL PLHL 
ALVWTPPL 


5658


2346 


3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLLALLKCTDTELOLRRDA I FCQALVAAVCTFS EQLLAAH3Y 
RYNNNGEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 
RTMLEDIWVTLSELDNVTFSFKQLDENYVANTNVFYHIEGSROA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLEMVEGLPS 
PGSQAAEDLQQD1NAQSLEKVQQYYRKLRAFYLERSNLPTDAST 
TAVKIDQLIRPINAIiDELCRLMKSFVHPKPGAAGSVGAGLIPZS 
SELCYRLGACQMVMCGTGMQRSTLSVSLEQAAILARSHGLLPKC 
I MQATD I MRKQG PR VE I LAKNLR VKDQMPQG APRIiYRLCQ PKMN 
GDL 


5659 


2 


696 


WKRSGEVSPKGELGAWRGNSGRPKIIGRAAEAENEDRTLGRLLP 

GNERSQPRS PLRLLA PQLKAEAAADKGLAP VPPPFS SGHSGP C\ 

EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 

AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 

EDNPAGAGG\AAVAGAAGGARRFLCGWEGFYGRPWVlvlEQRKEL 
FRRLQ KWE LNT YI* 


5660 


229 


853 


PVTMWAFSELPMPLLINLIVSLLGFVATVTLIPAFRGHFIAAR'L 
CGQDLN KTS RQQ I PE SQ3V I S GAVFLI I LFCFI P FP FLNC F VKE 
QRKAFPHHEFVALIGALLAICCMIFLQFADDVLNLRWRHKLLLP 
TAASLPLLMVYFTNFGNTTI WPKPFR P I LGLHLDDGR * S YHCC 
P YGT YFRE P FLVLH I LLQVFL FCLCVF P D P FW 


5661 


2 


473 


LNLYPSPCGGI PKLPGLPREAAAALGAS FLAEAPLPVTVRGSGL 
AGMAVTCD P XAFLS I CFVTLVFLQLPLAS I CQN'GTDSCASRG K 
ADFDVTGPHAPILAMAGGHVELQCQLFPNISAEDMELRWYRCQP 
SLAVHMHERGMDMDGEQKWQYRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKHKPSAKLSVRDALGAQNASGERIKIQGWIRSVRSQKEVLF 
LH VNDGSS LESLQWADS GLDS R ELTFGSS VE VQGQL I KS PS KR 
QNVELXAEKIKVIGNCDAKDFPIKYKERHPLEYLRQYPHFRCRT 
NVLGSILRIRSEATAAIHSFFKDSGFVHIHTPIITSNDSEGAGB 
LFQl^PSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSQSRRHuAEFYMIEAEISFVDSljQDLMQVIEELFK 
ATTMriVLSKCPEDVELCHKFIAPGQKDRL*HMLKNNFLIISYTE 
AVEILKQASQNFTFTPEWGADLRTEHEKYLVKHCGNIPVFVINY 
PIiTLKPFYMRDNEDGPQELEGSVA*HSLGLMILLSIWIGQP 


5663 

c CCA 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSLL 
VQS Y FE KG PLTFRD VA I E F S LEE WQCLDS AQQG L YR KVMLENYR 
NLVFLGIALTKPDLITCLEQGKEPWNrKRHEMVAKPPVICSHPP 
QDLWAEQDIKDSFQEAILKKYGKYGHANFQLQKGCKSVDECKVH 
KEHDNKLNQCLI PKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLiVQENGQRKYG 
GPPPGWDAAPPERGCEI FIGKLPRDLFEDELI PLCEKIGKI YEM 
RKM^FNGNNRGYAFVTFSNKVEAKNAIKQLNNYEIRNGRLLGV 
CASVDNCRLFVGG IPKTKK 


5665 


347 


702 


WQHLIILLHCERTSPAMITSELPVLQDSTNETTAHSDAGSELE 
ETEVKGXRKRGRPGRPPSTNKKPRKSPGEKSRIEAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEWKLGKSAMQRC 


5666 


213 


540 


VSCLPTSCKMITIiNNQDQPVPFNSSHPDEYKIAALVFYSCIFlI" 
GLFVNITALWVFSCTTKKRTTVTI YMMNVALVDLI FIMTLPFRM 
FY YAKDEWPFGEYFCQI LGA 


5667 


1 


69S 


HPLPSASLGLPSVSU3VSLCVRSALLEAWPHLPXRRnARVGS~ 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSS EATHWMEETSAEEAVSWQERRMAAAPPGCTP PALLD 
I SWLT ES LG AGQ P VPVE CRHRLEVAGP S KG PLS PAWM PA YACQR 
PTPLTHHNTGLSEALEI LAEAAGFEGS EGRLLTFCRAAS VLKAL 
PSPVTTLSQLQ 


5668 


691 


894 


CSPLFCtPDLFLQFLLGRKEEEAVLVGGEWSPSLDGLDPQADPQ 
VLVRTAI RCAQAQTGIDLSGCTKW 


5669 


407 


1 


DSGAPEGLSPLMSTQEGLSMHAHPQAYTPFIYLHARKRRGEIGD 
ADSRFNDR YAHKSAQLYFLYFVCW I FQDVYY FTI KEKNHFFFPK 

ARGAPTKYSGS PIGS PTTTPPTRP PS FNLHPAPHLLASMQIiQKL 
NSQ 


5670 


3 


373 


ssecltmawiplllpllilctvsva^Velaqpssvsvspgqtak 

ITCSGDVLAKKYARWFQQKPGQAPVLVIYIGDTERPSGIPERFSG 
S TSGTTVTLTI SGAQVEDEAD Y FCYS ATDNFLWVF [ 


5671 


280 


524 


KFPPKKTPPHJbGMESAITLV7QFLLQLLLDQKHEHLICWTSNDGE 
F KLLKAKKVAK LWGLR KNKTNMNYD KLS RALRLL FMT 


5672 


2 


557 


FVPATPDPGWJLPPSRDPAMAKRSSIjYIRIVEGKNLPAKDITGS 
SDPYCIVKVDNEPIIRTATVWKTLCPn^GEEYQVHLPPTFHAVA 
FYVKDEDALSRDDVIGKVCLTRDTIASHPKGKFSIiPSHTGLPSP 
W P PSKS ETS PLCS VWS PAQG K P FLLS P E AG AT FCT PG LCS AACS 
QAWLLLPLP 


"5673 


327 


696 


ITVADQISHWSAGRIKNRTRIPECIHSSAATTLAGPHTMEGESV 
KLSSQTLIQAGDDEKNQRTITVNPAHMGKAFKVMNEIiRSKQLLC 
DVMI VAEDVEI EAHR WLAACS P Y FCAMFTGDMS 


5674 


17 


984


GGGSMEGESTSAVLSGFVLGALAFQHLNTDSDTEGFLLGEVKGE " 

AKNSITDSQMDDVEWYTIDIQKYIPCYQLFSFYNSSGEVNEQA 

ijiuvijjan viuux v vvjW i a± KRHSDQIMTFRERIjLHKNLQEHFSNQ 

DIiVFLLLTPSIITESCSTHRLEHSLYKPQKGLFHRVPLWANLG 

MSEQLGYKTVSGSCMSTGFSRAVQTHSSKFFEEDGSLKEVHKIN 

EMYASLQEELKSICKKVEDSEQAVDKLVKDVNRLKREIEKRRGA 

QIQAAREKNIQKDPQENIFLCQALRTFFPNSEFLHSCVMSIiKID 

MFLKVAVTTTTISM 


5675


eo 


753 


EGS RRG PTR LARLS ARAGRLH FP PGFS S RL I HFRG VS ECRR P PG 
KSGVPVSAPGSDGKWWEERPGMFSLMASCCGWFKRWREPVRKVT 
LLMVGLDNAGKTATAKGIQGE YPEDVAPTVG FS KINLRQGKFE V 
TIFDLGGGIRIRGIWKNYYAESYGVIFVVDSSDEERMEETKEAM 
SEMLRHPRI SGKP ILVLANKQDKEGALGEAbvl ECLSLEKLVNE 
HKCL 


5676 


2 


930 


FVSS PPPRP VQPARPGG FGLSGRRSLLCQVASTPAHVGVMRS P V 
RDliARNDGEESTDRTPLLPGAPRAEAAPVCCSARYNLAILAFFG 
FFIVYALRVNIiSVALVDMVDSNTTLEDNRTSKACPEHSAPIKVH 
HNQTG KKYQWDAETQGW I LGS FFYG YI ITQI PGGYVASKIGGKM 
LLG FG I LGTAVLTL FTP I AADLG VG PL I VLRALEGLG EG VTF PA 
MHAMWSS WAPPLERS KLLS IS YAGAQLGTVIS LPLSGI I CYYMN 
WT YVFYFFGT I G I PWFLLW I WLVSDTPQKHKR ISHYEKEYILSS 
L 


5677 


1 


1028 


P PRDG FLE LRRLS V P LCSGPC PLTS LS RQGER SGGK b V AAARAA 
VTAETHPLPLLAPLAVCQSVKSPAACQVRPRPRAVALPAALGGP 
GRSLPGLTAATMSSFSESALFKKT.<?T?T,^NQnr>c\7rvrr ct uir ttttj 

RKHAGPIVSVWHRELRKAKSNRKLTFLYiiANDVIQNSKRKGPEF 
TREFESVLVBAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKLSMEDSKSFPPKATEEKKSLKRTFQOIQEEEDDDYPGSY- 
S PQDPSAGPLLTEEL I KALQDLENAASGDATVRQKI AS L PQEVQ 
DVS LLEK I TD KE AAERLS KTVD F. A HT .P NT) a DrtTc 


5678


3 


593 


SSSPPSSTPSLPLPFYLLLGQLRLQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SLAE FTEOFN01iHNRRNENtjnTif!PT/5PnponT?r , c r rTrc nTncroo 

PGQLS PGVQFQRRQNQRRFSME VRASGALPRQVAGCTH KGVHRR 
AAALQPDFDVSKRLSLPMDI 


5679


2 


£23 


lnsrvddfvavpgaimdedyygsaaewgdeadggqqeddsgege" 
ddaevqqeclhkfstrdyimepsi pntlkryfqaggs penviql 
lsbny tavaqtvnllae wl i qtg ve pvqvqetvenhlkslli kh 
fdprkads i fte eget p awleqm i ahttwrdl fy klae ah pdcl 

MLN FTVKVGRVLELRRKVFMNVYFW LLVCFL 


5680 


256 


592 


RRLTSTSEKLQNRNSHTPLESLIHPQPSYKGFGIMFGKKKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSIJ^ADTANRPKPMV 
DPSCITPIQLAPMKTIVRGNKPC 


5681 


45 


869 


LLCAKTLGVRTKESQAEGYNRSGINNHQAEDPRFCPSFCWMRSA 
RQTRPQ RliRKEAARPPT PG S CPGGTGMDGKKCS VWMFL PLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PAS CVFSQVMNMAAFLAL WAVLK F I QLKPKVLNP WLNI SGLVA 
LCLASFGMTLLGNFQLTNDBEIHNVGTSLTFGFGTLTCWIQAAIi 
TLKVNIKNEGRRVGIPRVILSASITLCVGPLLHPHGPKHPHVCS 
Q5PVGPGHVL 


5682 


39 


622 


PSRS CLGTMRKWRHREVNLPE VTQQDAVCPAP I PS PGLSAQTGL 
QKIWGTIHCQVCPGAPAWPGSPWHEBMGLLLLVPLLLLPGSYGL 
P FYNG FY YSNS ANDQN LGNGHGKDLLNG VKL WE TP EETL FT YQ 
GASVILPCRYRYEPAliVSPRRVRVKWWKLSENGAPEKDVLVAIG 
LRHRS FGDYQGRVHLRQD 


5683 


89 


778 


GSCGATALITRCLAWSVLISRLAMATYTCITCRVAFRDADMQRA 
HYKTDWHRYNLRRKVASMAPVTAEGFQERVRAQRAVAEEESKGS 
ATYCT VC S KK FAS FN AYE NH LKS RRHVELEK KA VQ A VNR KVEMM 
NEKNLE KGLGVDS VD KDAMNAA IQQAI KAQP S MS P KKA P PA PAK 

EARNWAVGTGGRGTHDRDPSEKPPRLQWFECQAKKLAKHSEDD 
SEDEEHDLC 


5684 


195 


577 


TWCFRGYLGPRVtMXALbEPPYLTVGTbVSAKYRGAFCEAKIKT 
AKRLVKVKVTFRHDSSTVEVQDDHIKGPLECVGAIVEVKNLDGAY 
QEAVINKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQLPLTNPEHFOTPVIG KKTNRGRR YE 


5685


779 


1262 


LLLQQ P WHCFLLFP P FR FSHHMI PG P PG PHTTG I PH PA i WP Q 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKESAAINQI LGRRWHALSREEQAKYYELARKE 
RQLHMQLYPGWSARDNYVS PS S I PVALHS 


5686


128 


1181 


CTWWQVNI TtiLD INDNHPTWKDAP Y Y INLVEMTPPDSDVTTWA 
VDPDLGENGTLVYSIQPPNKFYSLNSTTGKIRTTHAMLDRENPD 
PHEAELMRK I WS VTD CGR P P LKATSS ATVFVNLLDLNDND PTF 
QNLPFVAEVLEGI PAGVS I YQWAIDLDEGLNGLVSYRMP VGMP 
MDFLINSSSGVVVTTTELDRBRIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVSVSEJDVPR\GSGWSG*AARN 
NDVGl^EIiSYFITGGlTVDGKFSVGYRDAVVRTVVGLDRETTAA 
YMLILEAI DNGPVGKRHTGTATVFVTVLDVNDKRPI 1LQSS YV 


5687 


17 


917 


AAPPAPPDG/PPP/PPPAPPTyPGPAAyAPASSCQPRLSAGRAA" 

QGDGGAAAVGHVLW PAVG PVR VNPGLQTPVPR PELLPG P \S SS 

LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 

SGCRMPSTSASE/AAGGQGACTHAKGS BTPPPASPQTSEPAPSP 

LPPHLTGQPGMYSSEAKLPNSFSCLGLAGTGAGI*GTASAHG0X3 

PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 

^PFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 


5688


1 


420 


LTKWDLFGKCYRLLKTGIEHGAMPEQVGVYWYS/CLYDSRKLFF 
* SHN 1 1 RS LL * KVI DDSLGQLPLLR EL LL**LNVIDRC 1 1 LA YV 
LRVEKTFAITYLKNFTVKVDFSLLGE I PLISMAAILKLWI MKID 
DGYIPAVF 


5689 


1504 


3 


HELSGKHI SM VSGNTCNWHPGGHS PGGGGQGB ITS KDRGB I PAL 
IWA/RK?IGTWTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 
GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKQ 
KLPRTREPPLLQAGWAVRKPPWSEAKEGLGQAGRPSGMDSSAS\ 
PQTPGGRGSLEWGLPLYLGPHHDVK*RSDRLG* PP * GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 

el*rvppgslgpstqckyeptdkhs\ggadaqlevstagsrstf 

GQELKGPLDAGRLWPGAPSASSSHR*GG*ERARAGAGHRGST*A 
SSKIEQGRPRPGPTSDALADVEGGAES/GPHPWPLPGTLPNR/P 
GSPPPA*ASAGRKGTVSTLGGGLL 


5690 


1424 


58 


PSPPAGVCAA^APLPLIJuVu^RDRRPCSPGAEAAPWQTGaPAID 
GAWRTS VSALRRGATG/ APCSPGAEAAPWQTGGPAI dg\dgelp 
*VRSEEAPRGCGAEGGGPGSGPVRRPGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPOAPPGGPAP 
GHAAALPERTRGVAEPPAWAHAGSDAWRAGR*SQRT*ERARPRH 
PTFQGRAGS\GQPGYQPPNPHPGfSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
S LS LLG P / PGAHNLDTAPQDR * HGP * GDKRGAPG VAGEDPR P P * 
GNF VR * LLLM P/ G VA* RHGTS P 7LGPSLG ENGGQW DS GNLFGTP 
KG*SHPAFTKST*SMEAEKSYWNHPHR\DRGRQGVRINCLRVGE 

SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGLNS PGLLP 


5691 
5692


107 


550 


ISNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSGZLSFIED 
VAHRMLATGSCTPEDLCFSLQVMQ*KTGTESWG*RFYIVEQN*S 
GDAPLIFSPYLSLTGNCGFAMLVEITERAMAH\CGSPGGPSLWG 
GVGVYVLLESVPLSYS 




1193 


548 


tqawtraekdrkgsvralrlhlergppt*rgshpl\qsvpciqk 

PSIFSSYPI/GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPL 
TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLNL P VMG ATRSNLQ P PR KVA VPG PTR * R DQ DS KQD FS S K PLQS 
VPGLASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD 


5693 


1258 


1330 


/uji vvtrvK/voi i"w/iv*'Mucj>r4jLiVSKAKJLiUJUSSRPSQNTE?QAP 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PE AI VWRS S R F PLW F P LRCCFW VSG FKDPNP VLRFF 


5694 


3 


1338 


gskeparslhrrgsghkssagkwgsvtlstagalg+kGlhq*wt " 
qrcl \ nnls s e efnas s s lns lpstptas rrns t i vlrtds ekr 

SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKS IGSPESTPKNQASH 
PTATKLAEL PPTPLRATAKS FVKP P S LANLDKVNSNS LDLPSS S 
DTTQCI 


5695 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*"WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLS WFS ESEE KAPKKLE YDSGSLKMEPGTS KWRRERPE S 
CDDSS KGGELKKP 1 S LGH PGS L KKG KTP P VA VTS P I THTAQSAL 

KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSK1QKSSGIPV 
KP VNGRKTS LD VSN S AE PG FLA PGARSN IQYRS LPRPAKS S SMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKS IGSPESTPKNQASH 

PTATKLAEL PPTPLRATAKS FVKP PS LANLDKVNSNSLDLPSSS 
DTTQCI 


5696 


3 


1338 



GS KE PARS LH R RGSGH KS S AGKWGS VTLSTAGALG * KQLHQ * WT 
QRCL \ NNL SS EEFNAS S S LNSL PS TPTASR RNS TI VLRTDS EKR 
SLAESGLSWFSESEEKAPXKLEYDSGSLKMBPGTSKWRRERPES 
CDDSS KGGE LKKP I S LGHPGSL KKG KTP PVAVT S PITHTAQS AL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGS FGYKKPPPATGTATVMQTGGSATLSKIQKS SGI PV 
KP VNGR KTS LDVS NSAEPG F LAPGARS N IQYRS LPRPAKS S S MS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKS IGS PESTPKNQASH 

PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEALS PPACP S APAPRRS 1 1 SRL FGTS PATEAAP PPPEP VPAA 
QGPATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEEEAEVAAPTKGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGPIAAQMLSFVMDDPDFESEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKND5DLFGLGLEEAGPKESSEEGK 
EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698 
5699


2 


666 


GAEAAEPQEDLPPLSQSSRFFQEQQKMNKSLGPVSFKDVAVDFT 

QEEWQQLDPEQKITYRDVMLENYSNLVSVGYHIIKPDVISKLEQ 

GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPS.RQTVFI 

ETLI * R /ERGNVPGNTFDVE TN P VP SRKI AYTH SLCNS CER\ G F 

NASSEYISSDGRYARMKADECSGCGKSLLHIKLEKTH?GDQA V E 
FNQ 




2 


1448 


RVRQPPGLWVRRTVPAMQCPAGLSRVPGVAG/DPSLPSFRGPRD 
EAAHRGTIQTARHTRKLYVQGPASGPPLPRVSTQVAI*DEKPLA 
RPS/GRTNAPFPQGQKPAGKAAPGPAAAGRVAMR\PGHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL+RS 
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS*HLDPNT 
WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
VPILFQNPSGALRSRRTEPAGWVPPTRHE+DDG*TAAPASGGAP 
VS TPTWAGT P/ LNAS LGPTDPQG KPGCRP P CALP KPAG PERS A* 
GGSLGCR/SMLPASSGPPPAPGPRRLAAGAHT<5ASAi?rDoaB a a 

GWQPRRPGFAGRAALPGPPHPPSS*RELGGLPGPGW*TLDPLPA 
HPAHPPGS APPWGALGGWAAARAS LPWS PSLCLS FPAVTPVAGL 
FPPGRG 


5700 




597 


NGHKGVWEINlY*RRSNIHKNSKSfiSHLNQDHSFPPPTPNSARS 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E*CSIASSLIKAILRVSVLSE 


5701 


59 


410 


IFEKICSDTQEFISPEINPQICSWLlFDKGAK/NHATdKDSLFN 
KWSWKNWLSTCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKL 
LE ENTGENLHDTGLGNVFLDMT P KTQ PT KQK 


5702 


3


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASBSSASSDGPHPVITPSR 
ASESSASSDGPHPVIT?SRASESSASSDGLH?VITPSRASESSA 
SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 
PVITPSRASESSASSDGPHPV1TPSWSPGSDVTLLAEALVTVTN 
I E VINCS ITE I ETTTSS I PGASDTDLI PTEGVKASSTSDPPALP 
DSTEAKPHITEVTASAETEjSTAGTTESAAPHATVGTPIjPTNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKVSGAA 
PVSIEAGSAVGKTTSFAGSSASSYSPSEAALKNFTPSETLTMDI 

ttkgpfptsrdplpsvpptttnssrgtnstlakittsakttmkp 
ptatpttartrptt\a*vqvkmevssscg*vwlprktsltpewq 
kg*cssstgnstptrltsrspycvsgeang/psaaarhvpyakr 
gccp*pgppptdcscvtvlrgtqkvpmkgsmskpltpdvatgps 
ltstgvyvwggaspvprgvlgltlahvlcfskekt 


5703


14 


1117 


hhkdsrsqglprtqec^rpelrpllcpralwpvtrlsyrcpwqa 
pkagigtkakpseshlklhpgwpsldrqgepatlgtgtghcsds 

RILRWHP*HTAAR* PRWRRLPSSHRWTRHLGVLRVQDKS * * VSL 
DPSCRPRFLRTC**YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GS WETAPGS * WCPWL * AARWTG WRTASGAS AGLGRAADRPS AWA 
RRVAGLLPGQGLTVRR*H* TAGAPAS VRS SQGATRSPAPGGDQC 

ACGRGP3SC*HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNI KDG VCAQ I E KN FARAKWKKAVR VTTIjM KR LRAPK 
QS S TAAAQSAS ATDTAT PG AAGG ATAAAASG ATS APEGDAARAA 

KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP . 


5705


23 


562 


GD YE FDSPYWDD I SQAAKDLVTRLMEVEQDQRI TAEEAI SHEWI 

SGNAASDKNI KDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE 

QS S TAAAQSAS ATDTAT PGAAGGATAAAAS GATS APEGDAARAA 

KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5706


1161 


610 


QLGRFXAQDTVAIRKVKEVFGTGAMRHWILFTHKED*GGQALD 
DYVANTDNCSLKDLVRECERRYCAFNNWGSVBEQRQQQAELLAV 
IERLGREREGSFHSNDbFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQVEKHKQELRENESNWAYKALLRVKHLMLIiHYE I FVFLLLCSI 
LFFIIFLF 


5707


28 


609 


GSPAPTPGFRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
M FA I QPGLAEGGQ F1/3D P PPGLCQ PELQ PDSKSN FMASAKDANE 
NWHGM PGRVEPI LRRSS SES PSDNQAFQAPGS PEEGVRS PPEGA 
E I PGAE P E KMGG AGTVCS P LE DNG YAS SSLSI DS RSSSPE PACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGWPGQPRPYLDLPA 
QAS VS RP HDRA* GE AVS LS LS SGDVCGHTDGGGAGS D PQAK PKP 
PRCPFTAMPSPRTKQKVRNKVCliLIAIRYSDIPSDVSKAP\GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVS PASGGPRKEGRQGSGG ♦ AGGGGP \ ARTHADL PCVGFVCS PP 
LLK+SDSPVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG 
RAAG * CS WQPAACCTPRSQ * WAVARSPS RCSRW* RQSGR*RG * S 
S RRRRGP *AAGRSTPAVP + PCS *GGAGRRAYACRTGWG YAPS R * 

LEPSGPTSGSAli+TWZX^WSTTifi* *cqt r*ftTAr"rr , T»T r>or\ce>r>e> *> 

AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
GRGRAMGS RCVCTCTGL PC PG I PLSGAS PGGSGETGAGRSHTLK 
AARSRLSPRPGSGSRGSY*SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTS S DH * YTGTRR PWAG PGTRCSTAPSRAAP P VS RCRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2031 


ITLCPLPQTEKCLNVVTEAATPLGIYLKARVEAGGLKELEISWG 
LHQIWRWGAWMRAGMGGCRCWGVMAPFAPK/NALSFLVNDCS 
L I HNNV CMAA VF VDRAGE WKLGGLD YM YSAQOUGGG PPRKG I PE 
LBQYDPPELADSSGRVVREKRSADMWRLGCLIWEVFNGPLPRAA 
ALRNPGK1PKTLVPHYCELVGANPKVKPNPARFLQNCRAPGGFM 
SNRFVETNLFLEE IQI KEPAEKQKFFQELS KSLDAFPEDFCRHK 
VLPQLLTAFE FGNAGAVVLTPLFKVGKFLS AEE YQQK I 1 PVWK 
MPS STDRAMR IRLLQQMEQFIQYLDEPTVNTQI FPHWHGFLDT 
NPAIREQTVKSMLEiLAPKLNEANLNVELMKHFARLQAKDEQGPI 
RCNTTVCLGKIGSYLSASTRHRVLTSAFSRATRDPFAPSRVAGV 
LGFAATHNLYSMNDCAQKILPVLCGLTVDPEKSVRDQAFKAIRS. 
FLS KLES VS ED PTQLE E VE KDVHAAS S PGMGG AAAS WAG WA VTG 
VSSLTSKLIRSHPTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVSRASQVS\TPTTNPPNPQSPTGAAGK\RGLLGTGLA 
GAKLPGATS * RYTAGQRV 


5710 


1 


562 


I PGST I SCE VELMARMAKT IDS FTQNQTRL WI I DGLDACEQDK 
VLQMLDTVRVLFSKGPFIAIFASDPHIIIKAINQNLNSVPSGFK 
\ LNGHD YMRN I VHLPV FLNS RGL/ RQ/ LQEN FS * LQQQMBTFHA 
QILQG YRKKLTEE FHRTALGR *QNLVARQPS IDG* DAIGFELYV 
CIAIQFNTNKDDAT 


5711 


1526 


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQIAKAVLSQQRPSLFHECAFHFFS*SLQRHTINLDQGIF*LIjM 
LSEBRQHLFESS/ 1 WTTPHNLK* / FEIHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQSLDISERLKFLLTLDCVDDTLI VLAEEHGCLDIIKELP " 

ETViniJ.NKCLTFHPSKRPTPDELMXDKVFSEVSPLYTPFTKPA 

SLFSSSLRCADLTLPEDISQLCKDINNDYLAERSIEEVYYLWCL 

AGGDLEKELVNKEIIRSKPPICTLPNFLFEDGESFGQGRDRSS/ 

TFR*YHWDIWMPAKK*IERCWGRSILPITLKMTSi»ILPYSNSN 

NELSAAATLPL 1 1 REKDTE YQLNR 1 1 LFDRLLKAY P Y KKNQ I W K 

EAR VD I P P LMRGLTWAALLG VEGAI HAK YDAIDKDT P I PTDRQ I 

EVDI PRCHQYDELLSSPEGHAKFRRVLKAWWSHPDLrVYWQGLD 

SLCAPFLYIiNFNNEALVYACMSAFI P KYbYNFFLKDNSHVI QE Y 

LTVFSQMIAFHDPEIiSNHLNEIGFIPDLYAIPWFLTMFTHVFPL 

HKIFHLW\DTLLLGEFLFPILYWE 


5713 


«34 


284 


PVCAVPVDRWPVLPREDQEGQQb*AKLPRDFRR*FQIIiGPMEGH 
T ACRCS RRG AQVQH LPRED I RAAE * D PHLREVW PG L PTS SATS P 
* RAVLTS PCSHLGS ADAASSHWLCGVS FH 


5714 


212 


613 


WGLGLG PTMSSLGGGS QDAGGSS SSSTNGSGGSGSSGPKAGAAD 
KS AWAAAAPASVADDTPPPERRNKSGI I SE PLNKSLRRSR PLS 
H YSSFG S SGGSGGGSMMGGESADKATAAAAAAS LLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 
QTPPAS KLQGGGGGLQTGWGLHP VPVTAAS PLPRWCLFGAVAK\ 
GLPGP*LCPSGAA/GGLQRGPGLSPLGAAGKVSCLHPPSMVENN 
DSTCHEHHEGILAARVTPVPNSGXPGRVLKPPGRVCRPPHPAAS 
PRPPGS/SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS 
QIRRKETKPL*RKTPAG\NNYQSNSIPVSQSPQLTVDLLPSAGR 
TQAPSGRGDAGKPTPGHG\LPKASVILTPNCPCSLAGGQ*PPGL 
YPKTPKQRRWRRPL/LLGPSQ+GSRQSTC* EV\GALGEPVRI PG 
L*PDLSCILSNGSKHRREGLSFPRSLGPGRRGPAGLQSbGCSPT 
PKNTACHSSGHVALQAGHDSARDVGSGHVALQAGHDSTQDVGRP 
VWRWIPbE*LGLSRETGQATRRGLVWISPGRAAAACVACAQALE 
EGPLRLPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 
GLT/GVPGTDPKRGGRKPG^SGQETQGPTVWSGPES PbQPKP * E 
RQE/VGAGASSGVGLSRGRAGGPSSAWBVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCRBACAAASPGLDSAAEPHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL*SGFFTIIVGGYSCCMPLKT 


5717 


44 


1489 


L PTEALRES E W VSE YG KCG PRGL VPEGES TS PL PSS VDTEDS LD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE 








GDS LG AR Pli liP YGLS DDE SGGGRALS AES E VE EPARG PGSARGB 
RPGPACQLCGGPTGEGPCOGAGGPGGGPLLPPRLIjYSCRLCTFV 
SHYSSHLKRHMG/THSGEKPFRCGRCPYASAQLVNLTRHTRTHTG 
EKP YRC PHC P FACS SLGNtiRRHQRTHAGP PTP PCPTCG FRCCTP 
RPARPPSPTEQEGAVPRRPEDALLLPDLSLHVPPGGASFLPDCG 
Q\CGVKGRASAGLDQNHCQS/SLFPWTCRGCGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGBKPYKCPL 
CPYACGNIANLKRHGRIHSGDKPFRCSLCNYSCNQSMNLIRHM 


5718 


120 


284 


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT 
S**STADPLHL 
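
The table legend defines one-letter residue codes together with three markers (`*` for a stop codon, `/` for a possible nucleotide deletion, `\` for a possible nucleotide insertion). The following is a minimal Python sketch, not part of the original disclosure, of how a row could be tallied and checked against its predicted nucleotide range. The function names `summarize_segment` and `codon_count` are hypothetical, and the sketch assumes that both nucleotide locations are inclusive and that every symbol other than `/` and `\` represents one codon. The input is the SEQ ID NO: 5718 row (locations 120 and 284).

```python
from collections import Counter

# One-letter codes and markers as listed in the table legend.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")
MARKERS = {"*": "stop", "/": "possible_deletion", "\\": "possible_insertion"}


def summarize_segment(segment: str) -> Counter:
    """Tally residues, legend markers, and unrecognized characters."""
    counts = Counter()
    for ch in "".join(segment.split()):  # ignore spaces and line breaks
        if ch in RESIDUES:
            counts["residue"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            counts["unrecognized"] += 1  # e.g. lowercase letters or digits
    return counts


def codon_count(begin: int, end: int) -> int:
    """Whole codons in an inclusive nucleotide range (assumed convention)."""
    return (abs(end - begin) + 1) // 3


# Segment and locations copied from the SEQ ID NO: 5718 row.
segment = """VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT
S**STADPLHL"""
summary = summarize_segment(segment)
symbols = summary["residue"] + summary["stop"]
print(summary, codon_count(120, 284), symbols)
```

Under these assumptions the number of residue and stop symbols can be compared with the codon count implied by the two locations; a mismatch, or a nonzero `unrecognized` tally, may point to a transcription error in the row rather than to a property of the sequence.
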


5719 


48 


428 


ELNNG PFQM F iiCNGGNLAVTGS WADRS PLHEAASQG RLLALRTL 
LSQGYNVNAVTLDHVTPLHEACLGDHVACAR7LLEAGANVNAIT 
IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5720 


1 


1051 


LQAFRNASEvPMVLVGTQDAISAA\NPRVYRRTSRARKIiSTDLk 
\RCT\YYE\TCGGTYGLQMWSVSFQDVAQKWAL\RKKQQ\LAI 
GPCK\SLPN\SPSH\SAVSAASIPARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQRELRIETIAASSTPTPIRKQSKRRSNIFTS 
RKGADP\DREKKAAGCKVDSIGSGRAIPIKQGIOLKRSGKSLNK 
EWKKKYVTIiCDNGLLTYHPSIiHDYMQNIHGKEIDI,LRTTVKVPG 
KRL PRATPATA PGTS PRANGLS VE RS NTQLGGGTG A PHS AS S AS 

LHSERPLSSSAWAGPRPEGLHQRSCSVSSADQWSEATTSLPPGM 
QHPASG 


5721 


97 


492 


RHSSPCCSLRRTERSSNAAVST/TTVQQFKRFIENYRRHIGCVA 
VFYAI AGGL F LE RAY Y YAFAAHHTG I TDTTRVG 1 1 L SRG TAAS I 
SFMFSYrLLTMCRNLITFLRETFLNRYVPFDAAVDFHRLlASTA 


5722 


68 


1043 


VALDVLAGS S PGGGMAGALLG PR VHG I RAVLRVARGG VQAPGAP 
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP 
QESVPASTSTARGPRRVSRRJLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAG LLG RQGQGGRGAERERAALQARRGRR P G PE PDQS CG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5723 


68 


1043 


VALDVLAGSS PGGGMAGALLG PR VHd I RAVI^VARGG VQAPGAP 
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP 

qesvpaststargprrvsrrlppqhpgprgrrrrpgagvgaprr 

GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG 

rpkqaeqqq\pkrptppargpqssgdpamlpqraglhtgglagt 
ksstreipemi 


5724 


3 


1841 


FTNEAPPAPbPDASASPLSPHRRAKSLDRRSTEPSVTPDLLNFK " 

KGWLTKQYEDGK)WKKHWFALADQSLRYYRDSVAEBAADLDGEID 

LSACYDVTEYPVQRNYGFQIHTKEGEFTLSAMTSGIRRNWIQTI 

MKHVHPTTAPDVTSSLPEEKNKSSCSFETCPRPTEKQEAELGEP 

DPEQKRSRARE\RRREGRSKTFDWAEFRPIQQALAQERVGGVGP 

ADTH\DPWRPEAEHGELERERARRREERRKRFGMLt)ATDGPGTE 

DAALRMEVDRSPGLPMSDJ.KTHNVHVEIEQRWHQVETTPLREEK 

QVPIAPVHLSSEDGGDRLSTHELTSLLEKELEQSQKEASDLLEQ 

NRLLQDQLR VALG R EQ.S AREG YVLQAT CERG FAAMEETHQKKI E 

DLQRQHQRELEKLREEXDRLLAEETAATISAIEAMKNAHREEME 
RELEKSQRSQISSVNSDVEALRRQYLEELQSVQRELEVLSEGYS 
QKCLENAHLAQALEAERQALRQCQRENQELNAHNQELNNRLAAE 
1TRLRTLLTGDGGGEATGSPLAQGKDAYELEVPSGARPCLTQLC 
TQEPQGSAAWPLS Y R WGGTDLRQQESQGPGRS KS PEGGEEQ 


5725 


3 


1049 


VNGHSEETSQ^PNRTEPHDSDCSVDLGISKSTEDLSPQKSCPVG 
S WKSHS I TNMEIGGL KI YD I LS DN\DLS SHLQ PLK/ FTS AVEG 
KNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN 
PE/GAKYNKRPHKWAHNLHLKYMVLHSI ISNTVAV\RSQRHFVA 








LQTKS PNRPCQFSSSAPS/ VDQRAQ/ INQS YAKHSANMNFSNHN 
NVRANTAYHLHQilLGPARHGEMWAI S PNDRLI PAVTRS T I QRQS 
SVSSTASVNLGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQRPLSART YS IDGPNASRPQSARPS INBI PERTMSVSDFNYSR 
TSP 


5726 


2 


486 


SRSbSMWKNSGLPASSHSSKLPVTVGFSGCVKRLRLHGRPbGAP 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 
QGS PGGSGEGPPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT 
GLI FHLGQARTPPYLQLOVTEKQVLLRADDG 


5727 


21 


221 


RP ILI LKETRRLPWATGYAE VINAGKSTHNEDQASCEVLTVKKK 
AG AVTST PNRNS S KRRS S L ?NGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG/LPG 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GG PAGAGGDAG/ LPGRCP S AP WRAGS RPAAS CPDWI PGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
EPPQyPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHLAEGGA 
KGS PRRLADPQDLPAGQMS LAPP FPP VAAVIRSNK 


5729 


1 


1525 


AGGAREVLTLgLGHFAGFVGAHKWNQQDAALGRATDSKE'^PGEL 
CPDVLYRTGRTLkGOETYTPPT TT MnT W^CT OCT fttrw-ir vnnv 

QLDAAIAWQGKLTTHKBELYPKNPYLQDFLSAEGVLSSDGVWRV 
KS I PNGKGSSPLPTATTPKPLIPTEAS IRVWSDFLRVHLHPRSI 
CMIQKYNHDGEAGRLEAFGQGESVLKEPKYQEELEDRLHFYVEE 
CDYLQGFQlLCDLHDGFSGVGAKAAETiTinnPY<:«pr:TT'rwnT t r> 
GP YHRGEAQRN I YR LLNT AFGL VHLTAHSS LVC PLS LGG SLGLR 
PEPPVS FPYLH YDATLPFHCSAILATALI7TVTCS\ YRLCSS PVS 
MVHL\ADMLSFCGKKWTAGAI I PFPLAPGQSLPDSLMQFGGAT 
P WT PLS ACG E P SGTRCFAQS WLRG I DRACHTSQLTPGT P P PS A 
LHACTTGEEILAQYLQQQQPGVMSSSHLLLTPCRVAPPYPHLFS 
SCS PPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLLANQQVFHISCFRCSYCNNK 
LSLGTYASLHGR1YCKPHFNQLFKSKGNYDEGFGHRPHKDLWAT 
KIETEGFHERPRNFENCGRPLKSPGGEDCPSC*GGCPGSNY*AQ 
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGELIPKDSCYMRKPPRRPkKRRQG/CALPQGCLTFKDVAI 
EFSLEEWKCLNPAQRALYRAVMLENYRNLESVGLTSKDSWYMRK 
KPGRGRG KQRRQE WFFLR VY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVT.VTLVCGFTSFSFSLPLYLCGCLRF 
PERTCSQLQQADWAPDFGPSS FVPSVfGATATGARKFLI AFNI \N 
LLGTKEQAHR1ALNLREQGRGKDQPGRLKKVQGIGWYLDEKNLA 
QVSTNLLDFEVTALHTVYE ETCREAQELS LPWGSQLVGLVPLK 
ALLDAA 


5733 


1 


460 


PALQEVNANALAWGKQ YENDARTLFE FTSGVNDTES P 1 1 YRDES 
MRTACS PDGLCSDGNGLELKCPFTSRDFMKFRLGGFEAI KSAYM 
AQVQYSMWVTRKNAWY FAN YDPRMKREGLHYVVI ERDEKYM\AS 
FDEI\VP\EFIGKMDEVLSRDPM 


5734 


3 


968 


RCNSPESLTSLLVLLTTANNLFVLIPAYSKNRAYAIFr'lVFTVI 
GSLFLMNLLTAI I YSQFRG YLMKSLQTSLFRRRLGTRAAFB VLS 
SMVGEGGAFPQAVGVKPQNLLQVLQKVQLDSSHKQAMMEKVRSY 
GS VLLS AE EFQKL FNELDRS WKEH P PRP EYQS P FLQSAOFL FG 
HYYFD YLGNLI ALANLVS I CVFLVLDADVLPAERDD FI LGILNC 
VFI VY YLLEMLLKVFALGLRG YLS YPS NV FDGLLT WLLVLE I S 
TL WCTDCHTQAGGRRWW/RLLS LWDMTRMLNMLI VFR FLRI I P 
SMKPMAWASTVLGL 


5735 
5736 


2 

I 


$40 
382 


FFTPCVARAFNF PDQATVKKAAYSLPRVGGGTS CGLPQARRISL " 

ATPRQLYK/SSNHTQRWQRREISNFEYLMFLNT1AGRTYNDLNQ 

YPVFPWVLTNYESEELDLTLPGNFRDLSKPIGALNPKRAVFYAE 

RYBTWBDDQSPPYHYNTHYSTATSTLSWLVRIVSIFIELACLWY 

LKILT 

GTRPST K KSG YS PQQ VA VI HCKGHQKENTAVAHSNQKADSAAQV j 








TARLSVTPPNLLPTVSFPQPDLPDNPVYSTTTEKIASDLRANKN 
QES**ILPDSGIFIP+T*TSYLQSTTHLRRAKLPQLLRR 


5737 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRANLGPCRRKR 
LQTLMR LAAG FQYS S HKD PS IiS AKE KKTD YHNEARG P WPGWVG * 
RTADGSCGRGPDGAHHPGPKSSSWRASRLLPGLGGSHHLDAYVG 
RDLECGTPAPLQLB I P PQPRGHPAP I PTGQAGPRDSG PGAS P * V 
ETR PLTDGR R * PGVR P VG WT P AH PAGTLR PRGAVE PS VSACGKW 
APS PTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


1 e 


440 


OTLSLNCTLPETLPMTPSF*LSFL*FPGLAilAKSIPTKTYSNEV 

vtlwyrppdillgstdystqidmw*govewogpcgkggglvtt • 

ATQPAAFIiFTVPSLPRGVGCI FYEMATGRPLFPGSTVEEQLHF I 
FR I LSEEAWALCAVETHR 


5739 


1 


1222 


S FQ RRG I R WNVHTXiH PHP RA V W AG IGRGHG S * PlVUG RARAP ALC 
FPTLLEFLRSLEPDLPALRAMGLHLWAAGPGTHPAGISDUoAEV 
SAEVDGPVPGYLSSPQSITDTCLYIFTSGTTGLPKAARISHLKI 
LQCQGFYQLCGVHQEDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VL KS KFS AGQ FWEDCQQHR VTVFQ Y I GELCR YL VNQ PP SKAERG 
HKVRLAVGSGJjRPDTWERFVRRFG PLQVLETYGLTEGNVATINY 
TGQRGAVGRASWLYKHIFPFSLIRYDVTTGEPIRDPQGHCMATS 
PGEPGLLVAPVSQQSPFLGYAGGPELAQGKLLKDVFRPGDVFPN 
TRDLLVCDDQGFLRFHDRTGDPFRWKGENVATTEVAEVFEALDF 
LQEVMVYGVTV 


5740 


265 


231 


PAYWLKVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A 
Y\nfERWN*NISRMVHALEQKRHPAGLSSSMALQLNPCLGMLMA 
LQS E LHKLYDE ETQSWVS G S ACGG YP 


5741 


1 


650 


PRKTMRRGVLMTLLQQSAMTLPIjWIGKPGDRPPPLCGAIPASGD 
YVARPGDKVAARVKAVDGDEQWIIAEWSYSHATNKYEVDDIDE 
EGKERHTLSRRRVIPLPQWKANPETDPEALFQKEQLVLALYPQT 
TCFYRALIHAPPQRPQDDYSVLFEDTSYADGYSPPLNVAQRYW 
ACKEPKKK* CRLADSPSPNDTGQDSRGRAGIKHI PPLKKK 


5742 


2 


362 


TQSVKEILKRNPNWLTDKDGNTALMIASKEGHTEIVQDLLDAG 
TYVNI PDRSGDTVLIGAVRGGHVE I VRALLQKYADIDI RGQDNK 
TALYWAVEKGNATM VRDI LQCNPDTE I CTKDG 


5743 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQEKfGLEEVKPLGEMQTDL 
KATGR E I S P RE KT P E VI DATEE IDKDLE ETGRR E I S PEENG PEE 

VKPVDEMETDLKTTGREGSSREKTREVIDAAEVIETDLEETERE 
ISPQE 


5744 
5745 


3 


703 

599 -4 


TRRTTTTS PTTTRQMTTT P AALPTT WTTPDLTTGTPLQMTTI A 
VFTTANTCIaSIiTPSTLPEEATGLLTPE PS KEG PI LTAESETVLP 
SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQNKTTKTGQMDGI PMS MKNEMP ISQLLM 1 1 AP 
SI/3FVLFALFVAFLLRGKLMET YCSQKHTRLDYIGDS KNVLNDV 
QHGR EDE DGLFTL | 




1400 




GECSRFVNLMKHSKKTYDSFQDELEDYIKVQKARGLEPKTCFRKM 
XGDYLE TCG Y KG E VNS RP T YRM FDQRL PS ET IQTY PRSCN I PQT 
VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 
YICGSHGVEHRVYKHFSSDNSTSTHQASHKQIHQKRKRHPEEGR 
EKSEEERSKHKRKKSCEEIDIiDKHKSIQRKKTEVEIETVHVSTE 
KLKNRKEKXSRDWSKJCEERKRTKKKKEQGQERTEEEMLWDQS I 


5746 


3 


821 


S FASGRLTPSSPAFDGELDIjORYSNGPAVQAWQT^MnzL-vyoTijgco 1 ■ 
RAGERRFPC P VCXJ KR FR FNS I LALHLRTHQP ERPRS P AARLLIjE 
LEERALLREARLGRARSSGGMQATPATEGIiARPQAPSSSAFRCP 

yckgkfrtsaererhlhilhrpwkcglcsfgssqeeellhhslt 
ahgaperplaatsaapppqpqpqpppqpeprsvpqpepepqper 
eatptpapaapeeppappefrcqvcgqsftqsmflkghmrkhka 
sfdhacpv 


5747 


2 


1326 

jj 


drhvetlcihflgpstgstaktggrnwlktgnclygntcrfvhg~ 

psprgkgyssnyrrsperptgdlreriknkrqdvdtepqkrnte 

sssspvrkessrgrhrekedikitkertpeseeenvewetnrdd 








SDNGDINYPYVHELSLEMKRQKIQRELMKbEQENMEKREEI I IK " 

KEVSPBWRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 

AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 

KYKVKDRIEEKTRDGKDRGRDFERQREKRDKPRSTSPAGQHHSP 

ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 

ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDHERTSQSHDRRH 

ERREDTRGKRDRE KDSREEREYEQDQSSS RDHRD DREPR DG RDR 

RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGLQFSCYSSLKHLYKWAIPAEG 
KKNENLQNIiLCGSGAGVISKIXTYPLDLFKKRLQVGGPEHARAA 
FGQVRR Y KG LMD CAKQVLQ KEGALG F F KGLS PS LLKAALS TG FM 
FFSYEFFCNVFKCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS " 

SASSTYSSAEERMQSEQIRKLRRELESSQEKVATLTSQLSANAN 

LVAAFEQSLVNMTSRLRHLAETAEEKDTEI»LDLRETIDFLKKKN 

SEAGAVIQGALNASETTPKELRIKRQNSSDSISSLNSITSHSSI 

GSSKDADA 


5750 


22 


866 


IFISI CLWNAHLCFLLLPKDCIDQVMKIjQNLFVDDSGRYIiAIQF ™ 

I ILE WAYVFL YY YE YRKAKDQLDIAKD I SQLQ I DLTGALG KRTRF 

QENYVAQLILDVRREGDVLSNCEFTPAPTPQEHIjTKNLELNDDT 

ILNDI KLADCEOFQMPDLCAEEIAI ILGICTNFQKNNPVHTLTE 

VELLAFTSCLLSQPKFWAIQTSALILRTKLEKGSTRRVERAMRQ 

TQALADQFEDKTTSVLERLKIFYGCQVPPHWAIQRQLASLLFEL 

GCTSSALQIFEKLEMWE 


5751 


i 

3 


751 


SCGSALRAWRCGAAALATFPAPALPGLMYRALYAFRSAEPNALA 
FAAGETFIiVLERSSAHWWLAARARSGETGYVPPAYLRRLQGLEQ 
DVLOAIDRAIEAVHNTAMRDGGKYSLEQRGVLQKLIHHRKETLS 
RRGPSASS VAVMTS STSDHHLDAAAARQPNGVCRAGFERQHSLP 
SSEHLGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 

> 1 


471 


GPVCGVGLSVAWAGPWRGPVHSVGGGGRAALHGAELPCLSGAAT 
VEREMELRKKNEMLRVETEARARAKAERENADIIREQIRLKASE 
HRQTVLESIRTAGTLFGEGFRAFVTDRDKVTATVNIFIKQGWQV 
AERQHVGASWS PRSCPCRLCTAL 


5753 


34 


483 


DDSXAI PGGVQAP FGAVRN I YTPRTGHRIRIOiDQ I QS GGN YVAG 
GQEAFKKLNYLDI GE I KKRPMEWNTBVKP VIHSRINVS ARFRK 
PLQEPCTIFLIANGDLINPASRLUPRKTLNQWDHVbQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHWEFAGEHAEAIASREQEVLQGWKELLSACEDARLHVSST 
ADALRFHSQVRDLLSWMDGIASQIGAADKPRCPSSLLGLPASPW 
W PTFAT PS PLTA P FSM E 


5755 


3 


888 


LGDQ F Y KE A I EHCRS YNSRLCAER S VRL PFLDSQTG VAQNNC Y I ' ~ 

WMEKRHRGPGLAPGQLYTYPARCWRKKRRLHPPEDPKLRLLEIK 

PEVELPLKKDGFTSESTTLEALLRGEGVEKKVDAREEESIQEIQ 

RVLEND ENVEEGNEEE DLEE DI P KRKNRTRGRARG S AGGRRRHD 

AASQEDHDKPYVCDICGKRYKNRPGLSYHYAHTHLASEEGDEAQ 

DQETRSPPNHRNENHRPQKGPDGTVIPNNYCDFCLGGSNMNKKS 

G RPE EL VS CADCGRS AHLGG EGR KE KEAAA 


5756 


3 


621 


SSKLQALFAHPLYNVPEEPPLIjGAEDSLLASQEALRYYRRKVAR 
WNRRHKMY REQMNLTS LDPPLQLRLEAS WVQ FHLG I NRHGL YS R 
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV 
HLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRI LDFRRVPPTVGR I VNVTKEI L 


5757 


3 


473 


YKDALLLPDNHRQWFENGTLKLTDVQKGMDEGEYIiCSVLIQPQ 
LSISQSVHVAVKVPPLIQPFEFPPASIGQLLYIPCWSSGDMPI 
RITWRKDGQVI I SGSGVTI ES KEFMSSLQ IS SVSLKHNGN YTC I 
ASNAAATVS RERQL I VR VP PR FW 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRLAVSRTDGTVEIYNLSANYFOEKFFPGHESRATSALC 
WAEGQRLFSAGLNGE IMEYDLQALNI KYAMDAFGGP I WSMAAS P 






WO 01/53312 



PCT/US00/34263 





5759 


2 


1240 


SGSQLLVGCEDGSVKJjFQITPDKIPV 

OWAAir AGC^» v V 1 1, i FHMSDLPS YTTNGTVHVVVNNQIGFTTDPR 
NARSS PYPTDVARWNAP 1FHVNADDPEAVI YVCS VAAEWRNTF 
NKDVGADLVCYRRRGHNEMDEPMFTQPLMYKQIHRQVPVLKKYA 
DKLIAEGTVTLQEFEEEIAKYDRICEEAYGRSKDKKILHIKHWL 
DSPWPGFFNVDGEPKSMTCPATGIPEDMLTHIGSVASSVPLEDF 
KI HTGLSR I LRGRADMTKNRT VDNALAE YNAFGS LLKEG I HVRL 
NGQD VERGT FS HRKHVLHDQEVDR RTCVPMNH UA PDQA P YTVCN 
S S LSEYGVLG FELG YAMAS PN AL VLWEAQ FGDFHNTAQ C 1 1 DQF 

ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQMSNDD 
SDAYPAFTKDF3VSQL 


5760 


1 


1221 


VKUITSDSLSLSWTVPEGQFDKFLVQFKNGDGQPKAVRVPGHED 
GVTISGLEPDHKYKMNLYGFHGGQRVGPVSAVGLTAPGKDEEMA 
PASTEPPTPEPPIKPRLEELTVTDATPDSLSLSWTVPEGQFDHF 
LVQYKNGDGOPKATRVPGHEDRVTISGLEPDNKYKMNLYGFHGG 
CRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTG 
SS PDSLSLS WTVPQGR FDS FTVQ YKDRDGRPQWRVGGEES EVT 
VGGLEPGRKYKI4HLYGLHEGRRVGPVSTVGVTAPQEDVDETPSP 
TEPGTEAPEPPEEPLLGELTVTGSSPD3LSLSWTVPQGRFDSFT 

VQYKDRDGRPQAVRVGGQESKVTVRGIiEPGRKYKMHLYGLHEGR 
RLGPVSAIGVT 


5761 


3 


1275 


SCDMAEAAAIiWIRGPGFGCKAVRCASGRCTVRD^ikRHCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQIEKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRLERLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAASSKMVSAEISENRKRQWPTKSQTDRGASAGKRRCFWIjGM 
EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGSQRARWNTDHGSPEQLQIPVTDSGRHILEDSCAELGESK 
EHMESRIWrETEETQEKKAES KEPI BEE PTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIDKETIDLLAFTSV 
AELEIiLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


GSTGQTPL^guGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
MS S EE AANGKKS HWAE LE I S G KVRS LS AS L WS LTHLTALHLS DN 
SLSRIPSDIAKLHNLVYLDLSSNKIR 


5763 
"'5764 


3 


429 


LDKDTGIj I MLIARLD YEL I QR FTLT I IARDGGGEE TTGRVR I N V 
LDVNDNVPTFQKDAYVGALRENEPSVTQLVRLRATDEDSPPNNQ 

ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYL 
TVMAMDAGN 




19 


441 


VCARACGEMRQLLRPIDRQRYDENEDLSDVEEIVSVRGFSLEEK 

bKbyjjyQGDFVHAMEGKDFNYEYVQREALRVPLIFREKDGLGIK 

MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTEMSMSQFVRYYE 
TPEAQRDKL 


5765 


3 


825 


QKILRLNNSHQPPTSSSNSKDCGGPASSGAGATAAiiADGLKFAS 
VQAS APQGNS HKETS KS KVKRS KTS KDANKSLPSAALYG I PE I S 
S7GKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSGSQAPSGGHLYGFGAKSNGGGASPFHCGGTGSGSVAAA 

^tv&i^Afc'UbuJ^GNSMIjVKKEEEEEESHRRIKKLKTEKVDPLF 
TVPAPPPHV 


5766 


1608 


663 


S GL F S VD P AS SQ AMELS D VTL t EG VGN EVM WAG WVL I LALVL 
AWLS TYVADSGS NQLLG A I VSAGDTS VLHLGHVDHLVAGQGNPE 
PTELPHPSEGNDEKAEEAGEGRGDSTGEAGAGGGVEPSLEHLLD 
IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TEELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLM VP VFWLLG WWYFRINYRQFFTAPATVS LVGVTVFFS FI>V 
FGMYGR 


5767 


2 


892 


NFRATPRPPTRPELRTGTEVILWYLDWRAI^KRKRMKANIKLVG 
SGFPLPSSDLDDSLTEEIDBKIGFRWDANFDWQKVADFRDAGGS 
LTEVKVEEEERDPQSPEFEIEEEEEMLSSVIPDSRRENELPDFP 








HIDEFFTLNSTPSHSAYDEiPriLLVNIEKQKLELEKRRLDIEAER 
LQVE KERLQ I E KEPXR H LDMEHERLQLEKERLQI ERE KLRLQI V 
NSEKPSLENELGQGBKSMLQPQDIETEKLKLERERLQLEKDRLQ 
FLK?2SEKLQ I EKERLQVEKDRLR IOKEGHLO 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA 
A^/AKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNKRVKPL 
S RVTS LANL I P P VKAT P LKRFSQTLQRS I S FRS ES R PD I LAPR P 
WSRNAAPSSTKRRDSKLWSETFDVC 


5769 
5770 


38 


667 


TK7KKGVKEKATDQSVKAFAEHCPELQYVGFMGCSVTSKGVIHL 
TKLRNLSSLDLRHITELDNETAMEIVfG^CKNLISLNLCliNWIIN 
DRCVEVIAKEGQNLKELyLVSCKITDyALIAIGRYSMTIETVDV 
GWCKEITDQGATLIAQSSKSLRYLGLMRCDKVNEVTVEQLVQQY 
PH I TFSTVLQDC KRTLERA YQMG WT PNMSAAS S 




1 


484 


DSRRYDVKTitKWSFLLEEHSKLIAKVRCLPQVQLDPLPTTLTIiA 
FASOLKKTSLSLTPDVPPADT.QRVnowT ^;cMTj«Dcv-\o7Vi^irvTc«*T 

AKGGRLLLADDMGliGKTIQAICIAAFYRKEWPLLWVPSSVRFT 
WBQAFLRWLPSLSPDCINVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQUTFEACYSGTSTPS 
FHGSHCSGSDHSSLGLEQLQDYMVTLRSKLGPLEIQQFAMLLRE 
YRXiGLPIQDYCTGLLKLYGDRRKFliLLGMRPFIPDQDIGYFEGF 
LEG VG IREGGX LTD^ PAR T K"R CMQ CTC A C aud e vtv^ a * /-»t» nn* ^ 

AFHRLLADITHDIB 


5772 


148 


383 


EFNLALVSPSHPQIKAEDDQPLPGVLLSLSGGLFRSNLLTQDNG 
ILTFSNLVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773 


2 


723 


PR VRS KHN FC FMEMNTRLQ VEHPVTEM ITGTDLVEWQLRI AAGE 
niruijyaDi i -uywrrt^u' iiAKi i AHUPSNNFMPVAGPLVHLSTPRA 
DPS TR I ETGVRQGDEVSVHYDPM I AKL WWAADRQAALTKLR YS 
U?OYNIVGLHTWIDFLLNLSGHPEFEAG]yVHTDFIPQHHKQLLL 

SRKAAAKESLCQAALGLILKEKAMTDTFTLQAHDQFSPFSSSSG 
RRLN I S YTRNMTLKDGKNS K 


5774 


2 


592 


». yuudhiiw vnv*uujntiwr KKrtvr oAJJol\i 1FLVSGDFVKVYST 
VTEECVHILHGHRNLVTGIQLNPNNHLQLYSCSLDGTIKLWDYI 
DGI hi KTFI VGCKLHALFTLAQAEDS VFVI VNKE KPDI FQLVS V 
KLPKSSSQEVBAKELSFVLDYINQSPKCIAFGNEGVYVAAVREF 
YLS VYFFKKETTS RVTLSSS 


5775 


3 


538 


SSGCCDPAAPSSLAEAATMPVSKCPKKSBSLWKGWDRKAQRNGL " 
RSQ VYAVNGDYYVG E WKDNVKHGKGTQVWKKKGAI YEGDWKFG K 
RDGYGTLSliPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 

EGDWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLRLSQNP 
RP 


5776 


2 


484 


R LPQDC VCQNLS ES LGTLCPS KGLL FVPP D t DRR.TVELRLG GN F 
1 1 HI S RQD FANMTGLVDLTLSRNTI SHI Q P FS FLDL E S LRS LHIj 
DSNRLPSIiGEDTLRGLVNLQHLIVNKNQIjGGIADEAFEDFLLTL 
EDLDLS YNNLHGPAVGLRGDAWVQPS TS 


5777 


2 


949 


gqdpepgqdlfqperevdpswgrgreprlgklrfqndhlsvlkq 
vxkleqalkdgsagldpqlpgtcysphcppdkaeagstlpenlg 
ggsgsevsqrvhpsdlegreptpelvedrkgscrrpwdrslenv 
yrgsegsptkpfinplpkprrtfkhagegdkdgkpgigfrkekr 
nlpplpslpppplpsspppssvnrrlwtgrqkssadhrksyefe 
d llqss s es s rvdwyaqtklg ltrtls e envyed i ld p pmkenp 
yedielhgrclgkkcvlnfpasptssipdtltkqslskpaffrq 

NSERRNV 


5778 


1 


1210 


qrrqsvsrlllpvflleppaepglepppeeeggepagvasepgs 
ggpcwlqleevpgpgplggggplrspssyssdelspgepltspp 
waplgaperpehllnrvleriaggatrdsaasdillddiviiths 
lflptekfloelhqyfvraggmegpeglgrxqaclamllhfldt 
yqgdlqeeegaghiikdlyllimkdeslyqglredtlrlhqlve 
tvelkipeenqppskqvkplfrhfrridsclqtrvafrgsdeif 
crvympdhsyvtirsrlsasvqdilgsvteklqyseepagreds 
l t lvavsssgekvllqptedcvftalginshlfactrdsyealv 








PLPEEIQVS^UTEIHRVEPEDVANHLTAPHWELPRCVHSLEFV 
DYVFHGE 


5779 


138 


1571 


EAVQVLIKHSADVNARDKNWQTPLHVAAANKAVKCABVI IPLLS 
SVNVSDRGGRTAIjHHAAIjNGHVEWVNLLLAKGANINAFDKXDRR 
ALHWAAYMGHLDWALLINHGAEVTCKDKKGYTPLHAAASNGQI 
NWKHLLNLG VE I DE INV YGNTALH I ACYNGQDAWNELI DYGA 
NVNQPNNKG FTPLHFAAASTHGALCLEIiLVNNGADVNIQS KDGK 
SPLHMTAVHGRFTRSQTLIQNGGEIDCVDKDGNTPLHVAARYGH 
EIJUlNTLITSGADTAKCGItlSMFPI/HLAALNAHSDCCRKLLSSG 
QKYSIV3LFSNEHVLSAGFEIDTPDKFGRTCLHAAAAGGNVECI 
KLLQS S G AD FHKKDKCGRT PLH YAAANCH FHC I ETLVTTGANVN 
ETDDWGXTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK 
EATLCLEFLLQNDANPS I RDKEG YNS I H YAAAYGHRQCLE LLLE 
RTNSGF3ESDSGATKSPLHLAVSEMP 


5780 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEEKSEPVS 
EIETSWKGSHFPVGWPPRAKSPTPESSTIASYVTLRK?KKMM 
DLRTER PRSAVEQLCIiAESTRPRMTVE BQMER I RRHCQACLREK 
KKqLNVIGASDQSPLQSPSNLRDNP 


5781 


19 


941 


RGSLGGHPWRPPWRAASQGCLPVSFVTGPHQERAYGGRGPGGAF 

PAPPVSGTCPPDLIYAPTPEKAEGGSQKNHQPPPGERAAHRDGE 

QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 

VQPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 

QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEBHGRQS\QAPPPLP 

VAPSRTCGGC*TWDPALLVSP/PQGDSTPELPAP\QQPTGGPSR 

CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176 


1237 


DRSMMSMAAuaYTDSYTDTYTEAYMVPPLPPEEPPTMPPLPPEE 
PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEPESSITLTPVESAWABEHEWPBRPVTCKV3ETPAMSAEPT 
VLASEPPVMSETAETFDSMRASaHVASEVSTSLLVPAVTTPVLA 
ESILEPPAMAAPESSAMAVLESSAVTVLESSTVTVLESSTVTVL 
EPSWTVPEPPWAEPDYVTIPVPWSAXEPSVPVLEPAVSVT.Q 
PSMIVSEPSVSVOESTVTVSEPAVTVSEQTQVIPTEVAIESTPM 
ILESSIMSSHVMKGINLSSGDQNLAPEIGMQEIALHSGEEPHAE 
EHbKGDFYESEHGINIDLNINNHLIAKEMEHNTVCAAGTSPVGE 
IGEEKILPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALEPD 
ATG\TSKGI3FTTASTLSLVNKYDVDLSLTTQDTEHDMLISTSP 

sggseadiegplpakdihldlpsninlvssdtneplpvkrd\dq 
tlaali\sl:<essggekbvppps*rehlpdsgfsaniedinkad 

LVRPVSSPRTWNVLPSPRAGL\EGP\LLASDFGPVQNLYSSPW 
\SSMP\ERASGS\SSGEKGG\YEIFVKVKDTHEKSKKNiCNRDKG 
EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 
HRS\QTRSRSRS/RDRRRRSSRSRSKSRGRRSVSKEKRKRSPKH 
RSKSRERKRKRSSSRDMRKTVRARSRTPSRRSRSKTPSRRRRSR 
SVGRRRSFSISPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 
RTPSRRSRTPSRRRRSRSWRRRS FS IS PVRLRRSRTPLRRR FS 
RS P I RRKRS RS SERGRS PKRLT DLDKAQLLE I AKANAAAMCAKA 
GVPLPPNLKPAPPPTIEEKVAKKSGGATIEELTEKCKQIAQSKE 
DDDVIVNKPHVSDEEEEEPPFYHHPFKLSEPKPIFFNLNIAAAK 
PTPPKSQVTJUTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 
isxruuwvr a^wijf ijti'VL/lis lAMSERAIiAQKRLSENAFDLEAMSM 
LNRAQERIDAWAQLNS I PGQFTGS TG VQVLTQ EQLANTG AQAW1 
KKDQFLRAAPVTGGMGAVLMRKMGWREGEGLGKNKEGNKEP1LV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRWQ P PE FIiL VHDS G PDHRKH FliFR VL 3 NGSA YQPNCM FFLNR 
Y 


5783 


1693 


698 

1 


DSGLRVAFTMEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFM 
QKLGFGTGVNVYLMKRSPRGLSHSPWAVKKINPICNDHYRSVYQ 
KRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCLAMEYGGEKSL 
TOLIEE/PI*SQ/PKIIiFQQP/LILKVALNMARGLKYLHQEKKL 








LHGDIKSSNWIKGDFETIKICDVGVSLPLDENMTVTDPEACYI 
GTEPWKPKEAVEENGVITD KADI FAFGLTLWEMMTLS I PH INLS 
NDDDDE D KTFDES D FDDE AYY AALGTRPP INME ELDE S YQ KV I E 
LFS VCTNEDPKDR PS AAHI VEALSTDV 


5784 


2669 


1388 


PR VRPR VRTDHN Y Y I SRI YGPS DS ASR DL WVNI DQM E KD KV K IH ( 
GILSNTHRQAARVNLSFDFPFYGHFLREITVATGGFIYTGEWH 
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTALVVQWDHVHL 
QDNYNLGS FTFQATLLMDGRI I FGYKE I PVLVTQ I SSTNHPVKV 
GLS DAF WVHRI QQ I PNVRRRT I YE YH RVE LQiMS KI TN I S AVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCFEESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E / D AVTSQF PTSLPTEDDTK I ALHLKDNGAS TDDS AA E K KGGTL 
HAGLI VG I LI LVL I VATAILVTVYMYHH PTSAAS I FFI ERR PSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5785 


2669 


1388 


prvrprvrtijhnyyisriygpsdsasrdlwvnidqmekdkvkihH 

G I LSNTHRQAAR VNLS FD FP F YG HFLRE I TVATGG F 1 YTGE WH 
RMLT ATQY I AP LMANFD PS VSRNS TVR Y FDNGTAL VVQ WDH VHL 
QDNYNLGS FTFQATLLMDGRI I FGYKEIPVLVTQISSTNHPVKV 
GLSDAFWVHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM 
TPLPTCLQFNRCGFCVSSQIGFNCSMCSKLQRCSSGFDRHRQDW 
VDSGCPEESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKXGGTL 
HAGLI VGI LI LVLI VATAI LVTVYMYHH PTSAAS I FFIERRPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5786 


2532 


1674 


S YKLPAAERRASS CSQ P PTPTRRRWPAPGRTS RGHRPQM * SGTpH 
APRPP ARSTVSPASPLPKPRAGRCGSRERSACSTFRPC* SLN*M 
S*H*KRNLSQRSSSMSRRPLSCARPHR**RQGLTVAARLPTWAK 
SPPLACSFCQAAQKSQSLSSGRSTR* PERMS FRP\SPPGNPAIP 
SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST 
GRSMMTCPTRWTATPWSARASSRPRNWPTP * WRPSGRLSTV* RA 
TGGSTATAPPKRFPRNVfNPMMAB 


5787 


2 


1460 


^saasvtsladevncp\icqgtlkeagslsncg/hknfcraclH 

T\RYCEIP\GPD\LEESP\TCP\LCKEPFRP\GSFRPNWQLANV 
VENIERLQLVSTLGLGEEDVCQEHGEKIYFFCEDDEMQLCWCR 
EAGEHATHTMRFLEDAAVAPYREQ IHKCLKCLI KEREE IQEIQS 
RENKRMQVLLTQVSTKRQQVISEFAHLRKFLEEQQSILLAQLES 
QDGDI LRQRDE FDLLVAGEI CRFSALIEELEEKNER PARELLTD 
IRSTL I RCETRKCRKPVAVS PELGQRI RDF PQQALPLQRBMKMF 
LE KLC FE LD YE PAH IS LDPQTS HP KLLLS EDHQRAQFS YKWQNS 
PDNPQRFDRATCVLAHTG I TGGRHTV7WS IDLAHGGSCTVG WS 

edvqrkgelrlrpeegvwavrlawgfvsalgsfpNtrltlkeqp 

RQVRVSLDYEVGWVTFTNAVTREP I YTFTASFTRKVI PFFG LWG 
RGSSFSLSS 


5788 


2 


6860 


ehsvsgrssaygdataeghpagpgsvssstgaistttghqegdg | 

segegegetegdvhtsnrlhmvrlmllerllqtlpqlrnvggvr 

aipymqvilmlttdldgedekdkgaldnllsqliaelgmdkkdv 

SKKNERSALNEVHLWMRLLSVFMSRTKSGSKSSICESSSLISS 
ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 
HTTSSP PDMSPFFLRQYVKGHAADVPEAYTQLLTEMVLRLP YQI 
KKITDTNSRIPPPVFDHSWFYFLSEYLMIQQTPFVRRQVRKLLL 

ficgskekyrqlrdlhtlds \hvrgi kklleeqgi FLRASWTA 

S PQSALQ YDTL 1 S LMEHLKACAEIAAQRTINWQKFC I KDDSVLY 

fllqvsflvdegvspvllqli^calcgskvlralaassgsssas 
sspapvaassgqattqsksstkkskkeekekekdgetsgsqedq 

LCTALVNQLNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 

iyrnssksqqellldlmwsiwpelpaygrkaaqfvdllgyfslk 
tpqte kkl ke ysqkave 1 lrtqnh 1 ltnh pn sn 1 yntl sgl vef 
dgyylesdpclvcnnpevpfcyiklssikvdtrytttqqwkli 
sshtiskvtvkigdlkrtkmvrtinlyynnrtvqaivelknkpa 
rwhkakkvqltpgqtevkidlplp I vasnlmibfadfyenyqas 

rETLQCPRCSASVPANPGVCGNCGENVYQCHKCRSINYDEKDPF | 






— 


LCNACGFCKYARFDFHLYAKPCCAVDPI ENEEDRKKAVSNINTL 

LDKADRVYHQLMGHRPQLENLLCKVNEAAPEKPQDDSGTAGGIS 

STS ASVNRYI LQLAQEYCGDCKNS FDELSKI IQKVFAS RKELLE 

YDLQQREAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 

CAS A VTEHC I TLLRALATNPALRHI LVS QG L I RE LFDYNLRRGA 

AAMREEVRQLMCLLTRDNPEATQQMNDLIIGKVSTALKGHWANP 

DLASSLQYEMLLLTDS IS KEDS CWELRLRCALSLFLMAVNI KTP 

VWENITLMCLRILQKLIKPPAPTSKKNKDVPVEALTTVKPYCN 

EIHAQAQLWLKRDPKASYDAWKKCLPIRGIDGNGKAPSKSELRH 

LYLTEKYVWRWKQFLSRRGKRTSPLDLKLGHNNWLRQVLFTPAT 

QAARQAACTIVEALATIPSRKQQVLDLLTSYLDELSIAGECAAE 

YLALYQKilTSAHWKVYLAARGVLPYVGNLITKEIARLLALEEA 

TLSTDLQQG Y ALKS LTGb LS S FVE VE S I KRHFKS RLVGTVLNG Y 

LCLRKLWQRTKLIDETQDMLLEMLEDMTTGTESETKAFI4AVCI 

ETAKRYNLDDYRTPVFIFERLCSIIYPEENEVTEFFVTLEKDPQ 

QEDFLQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDLVALLED 

DS GME LLVNN K 1 1 S LD L P VAEVY KKVWCTTNEG E PMR I V YRMRG 

liLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 

RLAGIRDFKQGRMLLTVLLKIiFSYCVKVKVNRQQLVKLEMNTIiN 

VMLGTLNLALVAEQESKDSGGAAVAEQVLSIMEIMCAEPNVEP 

LSEDKGNLLLTGDKDQLVMLLDQINSTFVRSNPSVLCGLliRIIP 

Y LS FG EVE KMC; I LVERF KP YCN FDK YDEDHSGDDKVFL\ DC FCK 

1 AAG I K\NNSNGHQL\ KDL \ I LQKG I TQNALD\ YMKKH I P / SAA 

R I W DAD I \ W KS FCLRPALP F I LRLLRG LA I QH PGTQVL I GTDS I 

PNLHKLEQVS\SDEGIGTLA\ENL\LESLREHPDVNKKIDA\AR 

RETRAEfCKRMAMAMRQKALGTLG \MTTMEKGQWD/TRTALLEA 

DWEEHEEP\GLTCCICREGYKFQPTKVLGIYTFTKRWLGGVW 

ENKPRETSRATSTVSHFNIVHYDC\HLA\AVSLARGREEWESAA 

LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 

TYQLNIHDI KLLFLRFAMEQS FSADTGGGGRBSNIHLI PYI IHT 

GLYVLNTTRATSREEKNLOGFLEQPKEKWVESAFEVDGPYYFTV 

LALH I LP P EQWRATR VE I LRRLLVTS QARA VAPGGATRLTD KAV 

KDYSAYRSSLLFWALVDLIYNMFKKVPTSNTBGGWSCSLAEYIR 

HNDMP I YEAAD KALKTFQEEFMPVETFSEFLDVAGLLSE I TDPE 

SFLKDLLNSVP 


5789 


1 


2407 


LPLHAVEKTGR PGQ PAL KMPGKLRS DAG LES DTAM KKGETLRKQ 
TEEKE KKE KPKS DKTEE I AEEEETVFPKAKQVKKKAE PSE VDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKWSSKTKK 
VTKNEEPSEEE I DAPKPKKMKKEKEMNGETREKS PKLKNG FPH P 
EPDCNPSEAASEESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
LKGRGVTFLFPIQAKTFHHVYSGKDLIAQARTGTGKTFSFAIPL 
IEKLHG\ELQDRKRGRAPQVLVLAPTRELANQVSKDFSDITKKL 
S VACFYGGTP YGGQFERMRNGI DILVGTPGR I KDHI QNGKLDLT 
KLNHWLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS 
ATCPHWVFNVAKKYMKSTYEQVDLIGKKTQKTAITVEHLAIKCH 
WTQRAAVIGDVIRVYSGHQGRTI I FCETKKEAQELSQNSAI KQD 
AQSLHGDIPQKOREITLKGFRNGSFGVLVATNVAARGLDIPEVD 
L VIQS S P P KD VE S Y I HRSGRTG RAG RTGVC I C FYQH KE EYQL VQ 
VEQKAGIKFKRIGVPSATEIIKASSKDAIRLLDSVPPTAISHFK 
QSAEKLIEEKGAVEAIAAALAHISGATSVDQRSLINSNVGFVTM 
ILQCSIEMPNISYAWKELKEQLGEEIDSKVKGMVFLKGKLGVCF 
DVPTASVTE IQEKWHDSRRWQLS VATEQPELEGPREGYGG FRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRS FS KAFGQ 


5790 


3786


1585 


ARRQRDPLQALRRRNQELKQQVDS LLSE SQLKEALEPNKR^HI Y 
QRC I QLKQAI DENKNALQICLSKADESAP VANYNQRKEEEHTLLD 
KLTQQLQGLAVTISRENITEVGAPTEEEBESESEDSEDSGGEEB 
DAEEEEBEKEENESHKWSTGEEYIAVGDFTAQQVGDLTFKKGEI 
LLVIEKKPDGWWlAKDAKGNEGLVPRTYLEPYSEEEEGQESSEE 
GSEEDVEAVDETADGAEVK\QRTDPHWSAVQKAI SEAG I FCLVN 
HVSFCYL1VLMRNRMETVEDTNGSETGFRAWNV0SRGRIFLVSK 








P VLQQ I N TVD VLTTMGAI PAG F R PSTLS QLLEEGNQ FRAN V FLQ 
PELMPSQLAFRDLMWDATEGTIRSRPSRISLILTLWSCKMIPLP 
GMS IQ VLS RHVRLC LFDGNKVLS NIHTVRATWQP KKP KTWTFS P 
QVTRILPCLLDGDCFIRSNSASPDLGILFELGISYIRNSTGBRG 
ELS CG W VFL KLFDAS G VP I PAKT YELFLNGGTPYE KG I E VDP S I 
S RRAHGS VF YQIMTMRRQPQltLVKliRS LNRRS RNVLSLLPETL I 
GNHCSIHI^IFYRQILGDVLLKDRMSLQSTDLISHPMIiATFPML 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PKSFLKVPRFLLVYH 
\GCVLPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 
GALQAL LS PDGVHB P FDLS EQTYD FLG EMRKNAV 


5791 


3 


1636 


LRVAEFAGTSR/ IGAGLIQPLHRAPARDHGLjRGGAAPALSVSH

GN/GKQWAMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQIiV 

NRS I QQGFC FNI LCVGETG IG KS TL I DTLFNTNFED Y E S S HFCP 

NVKLKAQTYELQESNVOLiCLTIVNTVGFGDQINKEESYQPIVDY 

IDAQFEAYLQEELKIKRSLFTYHDSRIHVCLYFISPTGHSLKTL 

DLLTMKNLDSKVYI I P VI AKADTVSKTELQKFKIKLMS ELVSNG 

VQ I YQFPTDDDTI AKVNAAMNGQLP FAWGSMDEVKVGNKMVKA 

RQYPWGWQVENEKHCDFVKLREMLICTNMEDLREQTHTRHYEL 

YR RCKLEEMGFTD VG PENKP VS VQ ET 5f E AKRHE FHGBRQRKEE E 

MKQMFVQRVKEXEAILKEAERELQAXFEHLKRLHQBBRMKLEEK 

RRLLEEEIIAFSKKKATSEIFHSQSFLATGSNLRKDKDRKNSQF 

FVKQKVPEHRRSSSQANFIKKKLEVCFDFAVICFITSIFGEQPQ 

LLIFMEKYFQVQGQYISQSE 


5792 


2263 


653 


AAAAPSPAWWCGVFWYWHTCWVMYGIVYTRPCSGDASCIQPY
LARRPKLQL\RHS FTTTRSHLGAENN I DLVLNVEDFDVES KFER 
TVNVS VPKKTRNNGTLYA YI FLHHAG VLPWHDGKQVHLVS PLTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 
NVMADNFVFDGS SLPADVHR YWKM I QLGKT VHYLPI LF I DQLSN 
RVKDLMVI NR S TTEL PLT VS YDKVS LGRLR FW I HMQDAVYS LQQ 
FG FS EKDAO EVKG I F VDTNL YFLALTFF VAAFHL LFD FLAFKND 
ISFWKKKKSMIGMSTKAVLWRCFSTWIFLPLLDEQTSLLVLVP 
AGVGAAI ELWKVKKALKMTI FWRGLMPEFQFGTYSESERKTEEY 
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 

IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5793 


2263 


653 


AAAAPSPAWWCGVFVVYVVHTCWVMYGIVYTRPCSGDASCIQPY
LARRPKLQL\ RHS FTTTRSHLGAENNI DLVLNVEDFDVES KFER 
TVNVS VPKKTRNNGTLYAY I FLHHAG VLPWHDGKQVHLVS PLTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 
NVMADNFVFDGSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN 
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
I S FWKKKKSM IGMS TKAVLWRC FST Wl FLFLLD EQTSLLVL VP 
AGVGAAIELWKVKKALKMTI FWRGLMPEFQFGTYSESERKTEEY 
DTQAM K YLS YLL Y PLCVGG A VYS LLNI KYKS W YS WL INSF VNG V 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 

IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5794 


1 


5016 


MG P RLS VWLbLL PAALLLH E E H SRAAAKGGCAGSG CG KCDCHG V

KGQKGERGLPGLQGVIGFPGMQGPEGPQGPPGQKGDTGEPGLPG 
TKGTRG P PGASG Y PGNPGL PG r ^GODO pprpptt vczrurTvnz-o 

GPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQ3PVGPPGFTGPPGPPGPPGPPGEKGQM 
GLS FQG PKGDKGDQGVSGPPG VPGQAQVQEKGDFATKGEKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPQLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 
RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 
3DQGPPGIPGQPGFIGEIGEKGQKGESCLICDIDGYRGPPGPQG 
PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK 








GEPGEFYFiJi^KGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 

PKGS PGS VG L KG ERG P PGG VG F PGS RG DTG P PG P PG YG PAG PIG 

DKGQAG FPGG PGS PGL PGPKGB PGKI VPLPG P PG AEGLPG S PG F 

PGPQGDRGFPGTPGR\PGL\PGBKGAVG\QPGIGFPGPPGPK5V 

DGLPGDMGPPGTPGRPGFNGLPGNPGVDGQKGEPGVGLPGLKGL 

PGLPGIPGTPGBKGSIGVPGVPGEHGAIGPPGLQGIRGEPGPPG 

LPG S VGS PG VPG I GP PGARG P PGGQG P PGLS G P PG I KG E KGF PG 

FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 

S KGE MG VMGTPGQ PGS PG PWGAPGLPGEKGD \HG FPGS SG PRGD 

PGLKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 

EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 

GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 

AGPPGIGIPGLRGEKGDQGIAGPPGSPGEKGEKGSIGIPGMPGS 

PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 

TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 

DKGSKGEVGFPGLAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 

GHATEG P KGDRG PQGQPG L PGLPG PMG ? PGLPG I DG VKGD KGN P 

GWPGAPGVPGPKGDPGFQGMPGIGGSPGITGSKGDMGPPGVPGF 

QGPKGLPGU3GIKGDQGDQGVPGAKGLPGPPGPPGPYDIIKGEP 

GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 

PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFL 

VTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAG 

SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 

ITGRNIRPFISRCAVCEAPAHVMAVHSQTIQIPPCPSGWSSLWI 

G YS FVMHTSAGAEGSGQALASPGSCLE E FRSAP F I ECHGRGTCN 

YYANAYSFWLATIERSEMFKXPrPSTLKAGELRTHVSRCQVCMR 


5795 


1192 


61 


STRSPTVEY1SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHE 
PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQNYDTIFEDYEKLLQSSNYVTKRQSLKLLGELILDRHN 
FAIMTKY I S KPENLKLMMNLLRDKSPNIQFEAFHVFKVFVAS PH 
KTQPIVEILLKNQPKLIEFLSSFQKERTDbEOFADEKNYLIKQI 
RDLKKTAP +RALR DS KR 


5796


2 


1078 


GRVGWELWCMYXSPPKDWWDAGDPSLPIRTPAMIGCS fwnrkf 

fge iglldpgmdvyggeni elg i kvwlcggsme vlpcsrvahi e 
rkkkpynsnigfytkrnalrvaevwmddykshvyiawnlplenp 
gidigdvserralrkslkcknfqwyldhvypemrrynntvayge 
lrnnkakdvcldqgplenhtailypchgwgpqlarytkegflhl 
galgtttllpdtrclvdnsksrlpqlldcdkvxsslykrwnfiq 

NGAIMNKGTGRCLEVENRGLAGIDLILRSCTGQRWTIKN3IK*R 

EGAGALEPGPQDMAAPPNIWTSCPGGETARGRQVIiDGPPRASPG 
QHRDPG 


5797


2


891


PRVRQKTLVDVTLENSNIKDQIRNLQQTYEASHDKLREKQRQLE

vaqvenqllkmkvessqeanaevmremtkklysqyeeklqeeqr

KHSAEKEALLEETNSFLKAIEEANKKMQAAEISLEEKDQRIGEL
DRLIERMEKERHQLQLQLLEHETEMSGELTDSDKERYQQLEEAS
AS LRE R I RH LNDMVHCQQ KKVKQMVE E I E S LKKKLQQ KQLL I LQ
LLEKISFLEGENNELQSRLDYLTETQAKTEVETREIGVGCDLLP
SQTGRTREIVMPSRKYTPYTRVLELTMKKTLT


5798


644 


115 


KILGSRWKSMSNQEKOPYYEEQARLSKIHLEKYPNYKYKPRPKR
TCIVDGKKLRIG3YKQLMRSRRQEMRQFFTVGQQPQIPITTGTG 
WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGSIoAGNEMINGEDEMEMYDDYEDDPKSDYSSENEAPEAVSAN 


5799 


2679 


1435 


LLSTY I KFINLFPETKAT IQGVLRAGSQLRNADVELQQRAVE YL 
TLSSVASTDVXjATVLEEMPPFPERESS ILAKLKRiCKGPGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 
EADBLLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 
KTSVOFQNFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL 
W I ECLRDFLTP PLLS VR FRYGGA PQALTLKL P VT 1 NKFFQ PTEM 
ftAQDF FQR WKQLSLPQQEAQ K I FKANH PMDAEVTKAKL U3 FG SA 








LLDNVD PN P ENFVGAG 1 1 QTKALQVGCLLRLEPNAQAQM YRLT h 
RTSXEPVSRHLCELLAQQF 


5800 


2679 


1435 


LLSTYI KF INLF P ET KATI QG VLRAGS QLRNADVE LQORAVE Y L 
TLSSVASTDVLATVLESMPPFPERESSILAKZiKRKKGPGAGSAIi 
DDGRRDPSSNDINGGMSPTPSTVSTPSPSADI*LGLRAAP?PAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 
EADEIiNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYiiFYGN 
KTS VQFQNFS PIWH PGDLQTQLAVQT KR VAAQVDGG AQVQQVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 
LLDNVDPNPENFVGAG I IQTKALQVGCLLRLE PNAQAQMYRLTL 
RTSKEPVSRHLCELLAQQF 


5801 


3 


1413 


FPRLYHLIPDGEITSTK TMRUnPQP^'r ^tpt wnpci?n > DT ituttt 
QH I YRDGVI ARDGRLLPGDI I LKVNGMD I SNVPHNYAVRLLRQP 
CQVLWLTVMREQKFRS RNNGQAPDAYR PRDDS FHVI LNKSS PEE 
OLGIKLVRKVDEPflVPT FNVT.nfJfiVa.VTJ'wnftT ppwiidot stmpu 

DLRYGSPESAAHLIQASERRVHLWSRQVRQRSPDIFQEAGWNS 
NGSWSPGPGERSNTPKPLHPTITCHEKWN1QKDPGBSLGMTVA 
GGASHREWDIjPIYVISVEPGGVISRDGRIKl'GDILLNVDGVELT 
BVS RSEAVALLKRTS SS IVLKALEVKEYEPQEDCSS PAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
GYEEYNGNKPFFIKSIVEGTPAYNDGRIRCX3DXLLAVNGRSTSG 
MIHACLARLLKELKGRI TLTIVS WPGTFL 


5802 


3 


290 


GAFFYLISPLDFVPEALFGILGFLDDFFV1FLLLIYISIMYREV 
ITQRLTR 


5803 


2234 


1299 


EAQFGTTAE I YAYREEQDFGIB IVKVKAIGRQRFKVLELRTQSD 
G I QQAKVQ I LP2CVL P STMS AVQLES LN KCQ I FPSKPVS REDQC 
SYKWWQKYQKRXFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
WDENLKDDSLPSNPIDFSYRVAACLPIDDVIjRIQIjLKIGSAIQR 

lrce ld i mnkcts lcckq cqete i ttkne i fs ls lcg pmaa yvn 
phgyvhetltvykacnlnligrpstehswfpgyawtvaqckica 

SHIGWKFTATKXDKSPQKFWGLTRSALLPT1PDTEDEISPDKVI 
LCL


5804


A 


1707 


EMEKQRQEEQRKRTEiEERKRRIEQDMLEKRkiQRELAKRAEQIE

DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 

R EE KER I KYEE D KR I R YEEQR PS LKEAKCLS LVMDDE I ES E AKK 

ESLSPGKLKLTFEEIiERQRQENRKKQAEEEARKRLEEEKRAFEE 

ARRQMVNEDEENQDTAKIFKGYRPGJCLKLSFEEMERQRREDEKR 

KAE EEARRR I E EE KKAFAEARRNM WDDDS PEM YKT IS QE FLTP 

GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKQEFEQLRQEM 

GEEEEENETFGliSREYEELIKLKRSGSIQAKNLKSKFEKIGQLS 

EKE IQKKIEEERARRRAIDLE I KERE AENFHEEDDVDVRPAR KS 

EAPFTHKVNMKARFEQMAKAREEEEQRRIEEQKLLRMQPEQREI 

DAALQKKREEEEEEEGSIMNGSTAEDEEQTRSGAPWFKKPLKNT 

SWDSEPVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYIERGE 

TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


YISDTLGQVYKSKIRWWIEENGGNGNISVDDLIALLDLAEHASS
AFKESQQQSBDREYEVKERLYPKSKRRYDTYNIAGYQGEIEVGL 
YTIQILQLIPFFDNKNELSKRYMVWFVSGSSDIPGDPNNBYKLA 
LKNYI P YLTKLKFSLKKS FDFFDE Y FVLLKPRNNI KQNEBAKTR 
RKVAGYFKKYVDIFCLLEESQNNTGLGSKFSBPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRSLLKP I HVFFGAAILSLS IAS VISGINEKLFFSLKNTT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLASSWKRP 


5807 


2257


1302 


RFSKKTFRRPMAVDIQPACLGLYCGKTLLFKNGSTEIYGECGVC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMIjPLVLHWFFIEW 
YSGKKSSSALFQHITALFECSMAAIITLLVSDPVGVLYIRSCRV 
LMLSDWYTMLYNPSPDYVTTVHCTHEAVYPLYTIVFIYYAFCLV 








UWLLRPLLVKKIACGLGKSDRFKSIYAALYFPPILTVLQAVGG^ 
GLLYYAFPYIILVIjSLVTLAVYMSASBIENCYDLLVRKKRLIVL 
FSHWLLHAYGI IS ISRVDKLEQDLPLLALVPTPALFYLFTAKFT 
EPSRILSEGANGH 


5808


2 


433 


SLPDSGVVEYJ^NGGVADNHKDFGELRYNECLMNFSCNGKNGSS 

EGRITHGFOLKSAYEIJNLMPYTNYTFDFKGVIDYIFYSKTHMNV 

U3VU3PLDPQWLVENNITGCPHPHIPSDHFSLLTQLELHPPLLP 
LVNGVHLPNRR 


5809 


464 


2422 


ILVPGFOGlLHPG\nfCAI^SQHQAQELVADIDECEVSGL(^:GG 
RCV>raGSFECYCMDGYLPRNGPEPFHPTTDATSCTElDCGTPP 
EV P DGY I IGN YTSS LGS QVR YACREGF FS VPE DTVSS CTGLGTW 
ES? KLHCQE INCGNP PEMRHAI LVGNHS SRLGG VAR YVCQEG FE 
S PG G KI TS VCTEKGT WRE STIjTCT E I LTKI ND VSLFNDTCVR WQ 
I NS RRINP K I S YV I S I KGQR LD PM ES VRE ETVN LTTDS RTPE VC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 
S I FNETCLKLNRRSRKVGSEKMYQFTVLGQRW YLANFS HATSFN 
FT TREQ VP WCLDLY PTTDYTVNVTLLRSP KRHS VQ I T I ATP PA 
VKQTI SNI SGFNETGLR WRS I KTADMEEM YLFH I WGQ R WYQKE F 
AQEMTFNISSSSRDPEVCLDLRPGTNYNVSLRALSSBLPWISL 
TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV 
LP LALQS TFS CDSEG AS S FFSNAS D ADG YVAA EL LAKDV PDDAM 
BIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDSS LMLLQMAG VGLGSLAWI ILTFLS FSAV 


5810 


3 


1641 


KVFGTHKDHEVSTLDTAI S AVKVQLAEFLENLQE KSLR I EAFVS 
B I ES FFNT I EENCS KNEKRLEEQNESMMKKVLAQYDE KAQS FEE 
VKKKKME FLHEQMVHFLQSMDTAKDTLET I VREAEELDEAVFLT 
SFEEINERLLSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQPPRLEPQEPNSATSTTIAVYWSMNKEDVIDSFQVYCME 
EPQDDQEVNELVEEYRLTVKES YCI FEDLEPDRCYQVWVMAVNF 
TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA 
TETYTLEYCRQHSPEGEGLRSFSGIKGLQLKVNLQPNDNYFFYV 
RAINAFGTSEQSEAALISTRGTRFLLLRETAHPALIIISSSGTVI 
S FGERRRLTEI PS VLGEELPSCGOHYWETTVTDCPAYRLGICSS 
SAVQAGALGQGETSWYMHCSEPQRYTFFYSG I VSDVHVTERPAR 
VGILLDYNNQRLIFINAESEQLLFIIRHRFNEGVHPAFALEKPG 
KCTLHLG IBP PDS VRHK 


5811 


1918 


851 


AAALADPLPEDKWSAEKRRPLKSSLGYEITFSLLNPDPKSHDVY 

WDIEGAVRRYVQPFTiNALGAAGNFSVDSQILYYAMLGVNPRFDS 

ASSSYYLDMHSLPHVINPVESRLGSSAASLYPVLNFLLYVPELA 

HSPLYIQDKDGAPVATNAFHSPRWGG IM VYNVDS KT YNAS VLPV 

RVEVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT 

WELDRLLWARSVENLATATTTLTSLAQLLGKISNIVIKDDVASE 

VYKAVAAVQKSAEELASGHLASAFVASQEAVTSSELAFFDPSLL 

HLLYFPDDQKFAIYIPLFLPMAVPILLSLVKIFLETRKSWRKPE 
KTD 


5812 


5264 


2744 


GGRQRCQRGRSCGAREBEVEPGTARPPPAASAMDASLEklADPT 
LAEMGKNLKEAVKMLEDSQRRTEEENGKfCLISGDIPGPLQGSGQ 
DMV S I LQL VQNLMHGDEDEE PQS PR I QNIGEQGH MAL LGFS LGA 
YISTLDKEKLRKLTTRILSDTTLWLCRIFRYENGCAYFHREERE 
GLAKICRLAIHSRYEDFWDGFNVLYNKKPVIYLSAAARPGLGQ 
YLCNQLGLPFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDDIERG 
RLPLLLVANAGTAAVGHTbKIGRLKELCEQYGIWLHVEGVNLAT 
LALGYVSSSVLAAAKCDSMTMTPGPWLGLPAVPAVTLYKHDDPA 
LTLVAGLTSNKPTDKIiRALPLWLSLQYLGLDGFVERIKHACQLS 
QRLQESLKKVNYIKILVEDELSSPWVFRFFQELPGSDPVFKAV 
PVPNMTPSGVGRERHSCDALNRWLGBQLKQLVPASGLTVMDLEA 
EGTCLRFSPLMTAAVLGTRGEDVDQLVACIESKLPVLCCTLQLR 
EEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
GENIHAGLLKKLNELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVETIAATAREIEDNSRLLENMTEVVRKGIQEAQVELQKAS 
EERLLEEGVLRQIPWGSVLNWFSPVQALQKGRTFNLTAQSLES 








TEPIYVYKAQGAGVTLPPTPSGSRTKQRLPGQKPFKRSLRGSDA 
LSt-I-SSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


699 


HR DG VSGS LERPLTDRS RTGAFAQQRGKMATAGGG SGAD PGSRG 

LLRLLSFCVLLAGLCRGNSVERKIYIPLNKTAPCVRLLNATFQI 

GCQSSISGDTGVIHWEKEEDLQWVLTDGPNPPYMVLLESKHFT 

RDLMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 

SNSYGPEFAHCREIQWNSLGNGLAYEDFSFPIFLLEDENETKVI 

KQCYQDHNLSQNGSAPTF PLCAMQL FSHMAWLS FSTAT \ CMRRS 

SIQSTFSINPKIVCDPLSDYNVWSMLKPINTTGTLKPDDRVWA 

ATRLDSRSFFWNV\APGASSAVASFVTQLAAAEALQKAPDVTTL 

PRNVMFVFFQGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEL 

GQVALRTSIiELWMHTDPVSQKNESVRNQVEDLLATLEKSGAGVP 

AVI LRRPNQSQPLPPS SLQRFLRARN I SG WLADHSGAFHNKY Y 

QSiyDTAENINVSYPBWLEPLKE/ETWNFG+QDTAKALADVATV 

U5RALYELAGGTNFSDTVQADPQTVTRLLYG\ FLIKANNSWFQS 

ILQGRDLRSYLG*RGLFQH\YIAV\SSPTNTIYV/VLQYALANL 

TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 

RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 

IASKELEIiITLTVGFGILIFSLIVTYCINAKADVLFIAPREPGA 
VSY 


5814


8500 


432 


ALKCRPRRVLAILVGPVQPDRMAEEGAVAVGVRVRPLNSREESL 
GETAQVYWKTHNNVI YPVDGSKSFNFDRVLHGNETPKNVYEA\ I 
AAP 1 1 DSAIQG YNGTI FA\ YGQT\ ASGKTYTMMGS EDHLG VI PQ 
GQFHGH FSQK I * E VFLDRE FLLR VS YME I YNBTI TDLL CGTQ KM 
KPLIlREDVNRNrVYVADLTEEWYTSEMALKWITKGEKSRHYGE 
TKMNQRSSRSHTIFRMILESREKGEPSNCEGSVKVSHLNLVDLA 
GSERAAQTGAAGVRLKEGCNINRSLFIIiGQVIKKLSDGQVGGFI 
NYRDS KLTR I LQNS LGGNPKTRI I CTI TP VSFDETLTALQFAST 
AKYMKNTP YVNEVSTDEAIiLKR YRKE I MD LKKQLE E VS LE TRAQ 
AMEKDQIiAQLLE E KDLLQK VQN E K I ENLTRML VTS S S LTLQQ3L 
KAKRFCRRVTWCLGKINKMKNSNYADQFNIPTNITTKTHKLSINL 
LREIDESVCSESDVFSNTLDTLSEIEWNPATKLLNQENIESELN 
SLRADYDNLVLDYEQLRTEKEEMELKLKEKNDLDBFEALBRKTK 
KDQEMQLIHEISNLKNLVKHREVYNQDLENELSSKVELLREKED 
QIKKLQEYIDSQKLENIKMDLSYSLESIEDPKQMKQTLFDAETV 
ALDAKRESAFLRSENLELKEKMKELATTYKQMENDIOLYQSQIiE 
AKKKMQVDLEKELQSAFNE ITKLTSLI DGKVPKDLLCNLELEGK 
ITDLQKELNKEVEEWEALRBEVILLSELKSLPSEVERLRKEIQD 
KSEELHIITSEKDKLFSEWHKESRVQGLLEEJGKTKDDLATTQ 
SNYKS TDQE FQN FKTLHMD FEQKYKMVLEENERMNQE I VNLS KE 
AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQbKEQLE 
NRDSPLQTVEREKTLITEKLQQTLEEVKTLTQEKDDLKQLQESL 
QIERDQLKSDIHDTVNMNIDTQEQLRNAIiESIiKQHQETINTLKS 
K I S EE VSRN LHM EENTG ETKDEFX3QKMVG I DKKQDLE AKNTQT L 
TADVKDNEI IEQQRKI FSLIQEKNELOQMLESVIAEKEQLKTDL 
KENIEMTIENQEELRLbGDELKKQQEIVAQEKNHAIKKEGELSR 
TCDRLAEVEEKLKEKSQQLQEKQQQLLMVQEEMS EMQKKINEIE 
NLKNELKNKELTLBHMETERLELAQKLNENYEEVKS I TKERKVL 
KE LQKS FETE RDH LRG Y I RE I EATG LQTKEE LKI AHI HLKEHQE 
TIDELRRSVSEKTAQIINTQDLEKSHTKLOEEIPVLHEEQELLP 
»ftiwiju x v*» a UNEULUDicyo «. i wJo i i JjAKIc.MBRLR1jNEKF 

QESQEEIKSLTKERDNLKTIKEALEVKHDQLKEHIRETLAKIQE 
SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSALLRIEIEMLG 
LSKRLQESHDEMKSVAKEKDDLQRLQEVLQSESDQLKENIKEIV 
AKHLETE EELKVAHCC LKEQEE T I N ELR VNLS E KETE I S T IQKQ 
LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEHR 
KAKDSALQSIESKMLELTNRLQESQEEIQIMIKEKEEMKRVQBA 
LQIERIX3LKENTKEIVAKMKESQEKEYQFI,KMTAVNETQEKMCE 
I EH LKEQFE TQKLNLEN I ETEN I RLTQI LHENLEEMRS VTKERD 
DLRSVEETLKVERDQLKENLRETITRDLEKQEELKIVHMHLKEH 








QSTIDKLROIVSBKTNEISNMQKDLEHSNDALKAQDLKiQEELR 

IAHMHLKEQQETIDKLHGIVSEKTDKLSNMQKDLENSNAKLQEK 

IQELKANEHQLITLKKDVTOBTQKKVSEMEQLKKQIKDQSLTLSK 

LEI ENLNLAQKLHENL3EMKS VMKERDNLRRVEETLKLERDQLK 

ESLQETKARDLEIQQELKTARMLSKEKKETVDKLREKISEKTIQ 

I SD I QKDLDKS KDBLQ KKIQE LQKK ELQLLR VKEDVNMSHKK I N 

EMEQLKKQFEPNYLCKCEMDN FQLTKKLHES LEE I R I VAKER DE 

LRR I KB S LKMERDQF I ATLREM I ARDRQNHQ V KP E KRLLS DGQQ 

HLMESLREKCSRIKELLKRYSEMDDHYEGLNRLSLDLEKEIEFH 

RIM KKXKYVLS Y VTK I KEEQHEC INKFEHDFI DEVE KQKELL I K 

IQHLQQDCDVPSRELRDLKLNQNMDLHIEE ILKDFSES BFPS I K 

TE FQQVLSNRKEMTQ FL E E W LNTR FD IE KLKNG IQKENDR I CQ V 

NNFFNNRIIAIMNESTEFEERSATISKEWEQDLKSLKEKKEKLF 

KN YQTLKTS LASGAQVNPTTQDNKNPHVTSRATQLTTEKI RELE 

NS LHEAKES AMHKE S K 1 1 KMQKELE VTNDI I AKLQAKVHES N KC 

LEKTKETIQVLQDKVALGAKPYKEEIEDLKMKLGKIDLEKMKNA 

KEFEKEISATKATVEYQKEVIRLLRENLRRSQQAQDTSVISEHT 

DPQPSNKPLTCGGGSGIVQNTKALILKSEHIRLEKEISKLKQQN 

EQLIKQKNELLSMNQHLSNEVKTWKERTLKREAHKQVTCENSPK 

SPKVTGTASKKKQITPSQCECERNLQDPVPKESPKSCFFDSRSKS 

LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 

DVP\ECKTQ 


5815


23 


1460 


S ELVM WTVQNR ES LGLLS F PVW I TM VCCAKS TN EPSNMS Y VKE T 
VDRLLKG YD I RLRPDFGG P PVDVGMRI DVAS IDMVSEVNMD YTL 
TM YFQQS WKD KRLSYSG I PLNLTLDNRVADQLWVPDTYFLNDKK 
S F VHGVTVKNRM I RLH P DGTVL YGL R I TTTAACMMDLRR Y PLDE 
QNCTLE IESYGYTTDDIE F YWNGG EG A VTG VN K I E LPQ FS I VDY 
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL 
SWVS FW I NYDAS AARVALG I TTVLTMTT I S THLRETLPKI P Y VK 
AID I YLMGCFVFVFLALLE YAFVNY I F FGKGPQKKGAS KQDQSA 
NEKNKLEMNKVQ VDAHGN I LLSTLE I RNETSGSE VLTS VSDPKA 
TMYSYDSASIQYRKPLSSRE\A*GRAPDRHGVPSJCGRIRRRAS\ 
QLKVKI PDLTDVNS IDKWSRMFFPI TFSLFNWYWLYYVH 


5816 


861 


191 


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP 
TVYHERQRLELCAVHALNNVLQQQLFSQEAADBICKRLAPDSRIj 
NPHRSLLGTGNYDVNVIMAALQGIX3IAAVWWDRRRPL5QLALPQ 
VLGLI LNLPS PVSLGLLS LPLRRRHLRWPCARL/ VTVS YYNLDS 
K\ LRAP EGPGGLRTE \ * G P FLAAALAQGLCEVLL WT KE VE EKG 
SWLRTD


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD 
VMSNTTVPNAPQANSpSMVGYVLGPFFLITLVGVWAWMYVQK 
KKRVDRLRHHLLPMYSYDPAEELHEAEQELLSDMGDPKW\QAG 
RVATST5GCHCWMSRRDLTPLPHPSEPGVLDCLGPCHLLPLLSP 
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA 
HPQALMGRGFPSGMAAAGRHLCFL 


5818 


3 


3918 


QALR DKIMIFLVQS F YA VR HTE S WKLMS TDDQQ KI QAAAFDKGD 
DRRLGKKPIFSSSQQRKQVSDSGDIKIKSWRGNNKKECWSYLST 
NKKMKS DGLGASGHSSSTNRNS INKTLKQDDVKEKDGTKI AS KI 
TKELKTGGKNVSGKPKTVTKSKTENGDKARLENMSPRQVVERSA 
TAAAAATGQKNLLNGKG VRNQEGQ I SG AR P KVLTGNLNVQAKAX 
PLKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 
TEEEKPSGHKLSFCDSPGQMMKNSVDSVKNSTVAIKSRPVSRVT 
NGTSNKKSIHEQDTNVNNSVLKKVSGKGCSEPVPQAILKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGESPNSVKSSVSSRQSDENV 
AKLDHNTTTEKQA P KRKMVTCQ VHTALP KVNAKIVAM PXNLNQS K 
KGETLNNKDS KQKM P PGQVISKTQPSSQR PLKHETSTVQKSMFII 
DVRDNNNKDS VSEQKPHKPLI NLASEIS DAEALQSS CRP\ DPQK 
PLNDQEKE KLALECQNI S KLDKS LKHELE S KQ I CLDKS ETKFPN 
HKETDDCDAANI CCHSVGSDNVNS KF YS TTALKYMVSNPNENSL 
NSNPVCDLDSTSAGQIHLISDRENQVGRKDTNKQSSIKCVEDVS 
LCNPERTNGTLNSAQEDKKS KVPVEGLTI PS KLSDKS AMDEDKH 








ATADSD VS s ktJb'SGQLS EKNS PKNM ETS ESPESHBTPETP FVGH 
WNLSTG VLHQRES PES DTGSATTS G DD I KPRS ED YDAGGSQDDD 
GSNDRGISKCGTMLCHDFLGRSSSDTSTPEELKIYDSNLRIEVK 
MKKQSSNDLFQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR 

«<* "V 1 -" * v ooOni/C<l oL'DK.oCiMiLIv V AfciN fi I SNPAPQQFO 

GIINLAFEDATENECRBFSANKKFKRSVLLSVDECEBLGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQNKGWSVCKNESTVIJ)I^SIDSSRJCNKQSVSATEKKm'IDVL 
SSRSRQLLREDKKVNNGSNVENDIQQRSKFLDSDVKSQERPCHL 
DLHQREPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 
ANIALSAGDIDDCOTLAQTRMyDHRPSKTLSPIYEMDVIEAFEQ 
KVES2rHVTDMDF*DDQHFAKQDWTLLKQLLSEQDSNLDVTNSV 
PEDLSLAQYLINQTLLLARDSSKPQGITHIDTLNRWSELTSPLD 
S S AS I TMAS FS S EDCS PQG EWT I LELETQH 


5819 


1 


5557


AAAGLLGAbHLVMTLVVAAARAEKEAFVQSESIIBVLRFDDGGL 

LQTETTLGLSSYQQKSISLYRGNCRPIRFEPPMLDFHEQPVGMP 

KME K VYLHN PS S E * T I TL VS I F ATTS HFHAS FFQNR K I L PGGNT 

SFDVS/VFLARWGNVENTLFINTSNHGVFTY\QVFGVGVPNPY 

RLRPFLGARVTVNSSFSPIINIHNPHSEPLQWEMYSSGGDLHL 

ELPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 

TNASDSTEFIILPVEVEVTTAPGIYSSTEMLDFGTLRTQDLPKV 

LNLHLIjNSGTKDVPITSVRPTPQ\NDAITVHFKPITLKAS\ESK 

YTKVASISFDASKAKKPSQFSGKITVKAKEKSYSKLEIPYQAEV 

LDG YLGFDHAATLFHI RDS PADPVERPI YLTNTFS FAIL I HDVL 

LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHIDN 

NIIiLITNASKFHLPVRVYTGFLDYFVLPPKlEERFIDFGVLSAT 

EASNILFAI INSNPI ELAI KSWH I IGDG\LS 3 ELVAVDRGNRTT 

1 1 S5LPECEKSS SSDQS S VTLASGYF \AVFRVKLTAKKL \ EGIH 

DGAIQITTDYE1LTIPVK\AVIAVGSLTCSPKHWLPPSFPGKI 

VHQSLN I MNS F S QKVK IQQ I RSLS BD VR FYYKRLRGNKEDLE PG 

KKS K I AN I Y FD PGLOCGDHCYVGL PFLS KS E P KVQ PG VAMQ EDM 

WDADWDLHQSLFKGWTGI KENSGHRLSAI FEVNTDLQKNI ISKI 

TABLSWPSILSSPRHLKFPLTNTNCSS\EEEITLENP/SQDVPV 

YVQFIPLALYSNPSVFVDKLVSRFKLSKVAKIDLRTLEFQVFRN 

SAHPLQSSTGFMEG\LSPHLILNLILKPGEKKSVKVK\FTPVHN 

RTVSSLI I VRNNLTVMDAVMVQGQGTTENLRVAGKLPGPGSSLR 

FKITEAHiKDCTDSLKLREPNFTLKRTFKVENTGQLQIHIETIE 

ISG YS CEG YGFKWNCQEFTLSANASRD III LFTPDFTASRVI R 

ELKFITTSGSEFVFILNASLPYHMLATCAEALPRPNWELALYI I 

ISGIMSALFLLVIGTA\YLEAQGIWBP\FRRRLS\FEASNPPFD 

VGR P FDLRR I VG I S S EGNLNTLS CD PGHS RGPCGAGGSSS R PSA 

GSHKQ*GPSGHPHSSHSNRNSADVDDVRAYNSGRTSSMTSAQAA 

SSQPANKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 

PLEQHPQPPLPPPVPQPQEPQPERLSPAPLAHPSHPERASSARH 

SSEDSDITSLIEAMDKDFDHHDSPALEVFTEQPPSPLPKSKGKG 

KPLGRKVKPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTE 

TSNPDTEPLLKEDTEKQKGKQAI4PEKKESEMSQVKQKSKKLLNI 

KKEIPTDVKPS^IT.FT.DVTDDT.PCVrYDDXTT nnvrnr nmiiumnnn,, 
*v*vcjxr'xuvj\.rt>cjjj£.ljk'l l rrlitoiVyKKNliPSKIPIiPTAMTSGSK 

S RNAQ KTKGTSKLVDNRP P ALAKFLPNS QELGNTS S S EGE KDS P 
PPEWDSVPVHKPGSSTDSLYKLSLQTLNADrFIiKQRQTSPTPAS 
PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 
SLPGKKGNPTFAAVTAGYDKS PGGNGFAKVSSNKTGFSSS LG I S 
HAPVDSDGSDSSGLWSPVSNPSSPDFTPLNSFSAFGNSFNLTGE 
VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 
SGSPTHTATSVLGNTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 
TLASIGLMGTENSPAPHAPSTSSPADDLGQTYNPWRIWSPTIGR 
RSSDPWSNSHFPHEN 


5820 


310 


1270 


R VSLSG P VS LG VLIiCARSSTMGKRDNR VAYMNP I AMARS RGP I Q 
SSGPTIQ\VI*IDQGLPGKK*KSN*KRKRK/DSKALAEFEEKMN 
ENKKKELEKHREKLLSGSESSSKKRQRKKKEKKKSW+\DSSSS\ 
5SSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETES 








DSKDSLKKKKKSKDGTBKEKDIKGLSKKRKMYSEDKPLSSESLS 
ESEYIEEVRAKXKKSSEEREKATEKTKKKKKHKKHSKKECKKKAA 
SSSPDSP*H*EKSGFPYKESAMSEEISTVKTTTYLLKCMNFLVF 
GIIPGLFSSHSDATV 


5821 


179 


915 


KWRNQSWRW^KPGTNWMI^CSVCWRRVTWTGSVWMRKLGKHPQT 
PT/IKDCSIAATGKRPSARFPHQRRKKRREMDDGLAEGGPQRSN 
TYVIKLFDRSVDLAQFSENTPLYPICRAWMRNSPSVRERECSPS 
S PLP PLPEDEEG\ SEVTNS KSR * CVQACPPTHTPGGQPKNACR\ 
SRIPSPLAALW4QGTP*RWSPFEPEPSPSTLIYRNMQRWKRIR0 
RWKEASHRNQLRYSESMKILREMYERO 


5822 


464 


4379 


QTLKEMPIVMARDLEETASSSEDKEVISQEDHPCIMWTGGCRRl 

PVLVFHADA J LTKDNN1 R VIG ERYHLS YKI VRTDSRLVRS I LTA 

HGFHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 

LTR KD R L YKN 1 1 RMQHTHG F KAFH J bPQTFLL PAE YAE FCNS YS 

iCDRGPWrVKPVASSRGRG\VYLINNPNQISLEENILVSRYINNP 

LLIDDFKFDVRJLYVLVTSYDPLVIYLYEEGIiARFATVRYDQGAK 

NIRNQFMHLTNYSVNKKSGDYVSCDDPEVEDYGNKWSMSAMLRY 

LKQBGRDTTALMAHVEDLIIKTIISAELAIATACKTFVPHRSSC 

FELYGFDVLIDSTLKPWLLEVNLSPSLACDAPLDLKIKASMISD 

MFTWGFVCQDPAQRASTRP1YPTFESSRRNPFQKPQRCRPLSA 

SDABMKNLVGSAREKGPGKLGGSVLGLSMEEIKVLRRVKEENDR 

RGGF1RIFPTSETWEIYGSYLEHKTSMNYMIATRLFQDRMTADG 

APELKI * S LNS KAKLHAALYERKLLS LEVRKRRRRSSRLRAMRP 

KYPVITQPAEMNVKTETBSEEEEEVALDNEDEEQEASQEHSAGF 

LRENQAKYTPSLTALVENTPKENSMKVREWNKKGGHCCKLETQE 

LEPKFNLMQILQDNGNLSKMQARIAFSAYLQHVQI\RLMKDSGG 

QTFSASWAAKEDEQMEbWRFLKRASNNLQHSLRMVbPSRRLAL 

LERTRI LAHQLGD F 1 1 V YNKETEQMAEKKSKKKVEEEEEDGVNM 

ENFQEFIRQASEAELEEVLTFYTQKNKSASVFLGTHSKISKNNN 

viiouoK»i\p>\^utiifttl iMbiiVKIKPPKQQQTTEIHSDKLSRFTTSA 

E KEAKLVYSNS S SG PTATLQKI PNTH LS S VTTS DLS PG P CHHS S 

LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGbP 

RCRSGSHTIGPFSSFQSAAHIY5QKLSRPSSAKAGSCYLNKHHS 

G I AKTQKEGEDASLYS KRYNQSMVTAELQRLAE KQAARQYS PSS 

HINLLTQQVTNLNLATGIINRSSASAPPTLRPIISPSGPTWSTQ 

SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 

KYHPTAGS YQLQFALQQLEQQKLQSRQLLDQSRARHQAI FGSQT 

LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 

VPKPPPNHEQVLRRATSQKASKGSSAEGQLNGLQSSLNPAAFVP 

ITS STD PAHTKI MNHKHTEKQ P VHHS W VHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWSPLAGEKFVEVYKEAHLLALHIESSSRNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS PLLG P P VG E PRLLAS S PALPS SGAQARLTRAPG PPHS AHAL P 
RESCTAHAASQAATQRKPGTKLLLPRAAS VRGRGI PGAAEKPKK 
EIPASPSRTKIPABKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVPXNKLGLKKTLLKAPGSYSNVTtORIcqq^GaWuciraccn 
CTPQP VAKAKS SEFAS I PAN * LPGLCPN ISKS \GRMG PAMLRPA 
L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE 
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RS SG PAPQSLLSAWRVSALPT PAS RRCSGLPPMTPKTMPRA VGS 
PL\ C VPARRRS SE PR KNS AMRTE P TR ESNR KTDS R \ L VD VS PDR 
GS PPSRVPQALNFS PEES DSTFS KSTATB VAREEAKPGGDAAPS 
BAIiLVDIKLEPLAVTPDAASQPLIDLPLIDFCDTPEAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPI<IQI.«SPEADK 
ENVDSPLLKF 


5824 


42 


2293 


LLTALSMEGGGGRDBPSACRAGDVNMDDPKXEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TS ES P FAWS P LAGEKFVEVYKEAH LLAL-H 1 ES S SRNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DSPLLGPPVGEPRUASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAAS VRGRGI PGAAEKPKX 

EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAI P VP \ NKLGLKKTLLKAPGS Y <5 w\ T.O J? V Q <3 cr a \ \rw c-r> * oo * 

CT PQPVAKAKSS E FAS I PAN * LPGLCPNI S KS \GRMGPAMLRPA 
L\ PAGF VG \ ASS WQA KR VD VS E LAAEQLTA P P \SAS PTQPQTP E 
GGG \ QWLNS S CAWS ES SQLNKTRS 1RRRDS CLNS KTKVMPT PTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RS SG P A PQS LLS AWR VS AL PT P AS RR CS GLPPMT PKTM PRAVG S 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDI KLEPLAVTPDAASQPLI DLPL IDFCDTPEAHVAVGS E 

SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5825 


2 


4210 


FLQI ES AS PAPFSSG FIAAH PHS PGGSLATKGRSRLSAPGMLHL 

SAAPPAPPPEVTATARPCLCSVGRRGDGGKMAAAGAIjERSFVEIi 

SGAERER PRHFREFTVCS IGTANA VAGA VKYS ESAGG F Y YVE SG 

KLFSVTRNRFlHWKTSGDTLELMEESLDINIiLNNAIRLKFQNCS 

VLPGG VYVSETQNRVI I LMLTNQTVHRLLLPHPSRMYRSELWD 

SQMQS I FTD I GXVDFTDPCNYQL I PA VPG I S PNSTASTAWLSSD 

GEALFAIiPCASGGI FVLKLPPYDI PGMVSWELKQSS VMQRLLT 

GWMPTAIRGDQSPSDRPLSLAVHCVEHDAFIFALCQDHKLRMWS 

YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYL 

GI F \MHAPKRGQFCI FQLVSTESNRYSLDHI S SLFTSQETLIDF 

ALTSTDIWALWHDAENQTWKYINFEHNVAGQWNPVFMQPLPEE 

EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 

DLSWSELKKEVTLAVENELQGSVTEYBFSQEEFRJNLQQEFWCKF 

YACCLQYQEALSHPLALHLNPHTNMVCLLKKGYLSFLIPSSLVD 

HIiYLLPYENLIiTEDETTISDDVDIARDVICLIKCLRLlEESVTV 

DMSVIMEMSCYNLQSPEKAAEQILEDMITIDVENVMEDICSKLQ 

EIRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 

GSNTAGYIVCRGVHKIASTRFtilCRDLLILQQLLMRLGDAVIMG 

TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDT1.ESN 

LQHLSVLELTDSGALMANRFVSSPQTI VELFFQEVARKH IISHL 

FSQPKAPLSQTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 

GNCQYVQL0DYIOLLHPWCOVWVR«3PBPMTr , Dr , vT \rmT?r*r\vnT 

ECFCQAAS EVG KEE FLDR L I R SEDGE I VST PRLQY YDKVLRL LD 
VIGLPEL VIQLATSAI TEAS DDW\ KS QATL\ RTCI FKHHlA DLG 
\HNSQAYGSL* PQI PDSSRQLDCLRQLVWLCERSQLQDLVEFS 
YVNLHNEWGIIESRARAVDLMTHNYYELLYAFHIYRHNYRKAG 
TVMFBYGMRLGREVRTLRGLEKQGNCYLAALNCIiRLIRPEYAWI 
VQPVSGAV YDRPGASPKRNHDG ECTAAPTNRQI E I LELEDLEKE 
CSLAR I R LTLAQHD ?S AVAVAG S S S AEEM VTLL VQAGL FDTA r S 
LCQTFKLPLTPVFEGLAFKCIKLQFGGEAAQAEAWAWLAANQLS 
SVITTKESSATDEAWRLLSTYLERYKVQNNLYHHCVIt3KLLSHG 
VPLPJWLINSYKKVDAAELLRLYLNYDLLDLTPYQVrRlCGC 


5826 


3 


871 


KSQLLRDHSAPPPKPCTSVGAMGC+PRQ/SPKEQQRQLKKQKNR 
AAAQRS RQKHTDKAD ALHQQHE S LEKDNLALRKE I QS LQAE LAW 
WSRTLHVHERiCPMDCASCSAPGLLGCWDQAEGLLGPGPQGQHG 
CREQIiELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 
AWAE P P VQLSPS PLLFASHTGS S LQGS S S KLS ALQ PS LTAQTA 
PPQPLBLEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWDPSPHPLLAFPLLSSAQVHF 


5827 


194 


2287 


GMGS ENS ALXS YTLRE P P FTL PSGLAVY P AVLQDG KFAS V F VYK 
RENEDKVNKAAKVP* *HLKTLRHPCLLRFLSCTVEADGlHLVTE 
RVQ P LE VALETbS S AE VCAG I YD I LLAL I FLHDRGHLTHNN VCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQSIRDPASIPP 
E EMS PE FTTL PECHGHAR DA FS FGTLVES LLTI LNEQVS ADVLS 
SFQQTLHSTLLNP I PKWRPALCTLLSHDF FRNDFLE WNFLKS L 
TLKSBEEKTEFFKFIiDRVSCLSEELIASRLVPLLLNQLVFAEP 
VAV\KSFLPYLLGPKKDHAQGETPCLLSPALFX)SRVIP\/LLQLF 
EVHEEHVRMVLLSHIhJAYVGALSLjREQLKKV\IL\PQVLLG\LR 
D\TSDSIVAITLHSLAVLVSLLGPEVVVGGERTKIFKRTAP\SF 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSSSKK 
SEEWPDWSGPE\EPENQTVNl\QIWP\REP\CDDVKSQCrTLDV 
BESSWDDCEPSSLDTKVNPGGGITATKPVTSGEQKPIPALLSLT 
EESMPWKSSLPQKISLVQRGDDADQIEPPKVSSQERPLKVPSBb 
GLGEEFTIQVKKKPVKDPEMDWFADMI PEIKPSAAFLI LPELRT 

EMVPKKDDVSPWKJFSSKFAAAEITEGEAEGWEEEGELNWEDNN 
W 


5828 


2 


257 


AREGGSLGAVAACGELSySCDFCPARPHTSWLTRFVKMEFQAW 
MAVGGGSRMTDLTSS I PKPLLPVGNKFLI WYPLNLLERVGFEEV 
IVVTTRDVQKALCAEFKMKMKPDIVCIPDDADMGTADSLRYIYP 
KLKTDVLVLS CDI«I TDVALHEWDLFRAYDASLAMLMRKGQDS I 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANEADLDEELVIK 
GSILQKHPRIRFHTGLVDAHLYCLKKYIVDFLMENG\SITSIRS 
BL\ I PYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRIS Y 
SFY*KEANYTGTGAPY\D\ACWI 


5829 


260 


1259 


PDGRI.IVSCSEDFCTIKIWDTTNKQCVNNFSDSVGFANFVDFNPS 
GTC I AS AGS DQTVKVWD VRVNKL LQHYQVHSGGVNC I S FH PSGN 
YL ITASSDGTLKILDLLKGRL I YTLQGHTGPVFTVSFS KGGELF 
ASGGADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSPPHLLDIY 
PRTPHPHEEKVETVEDFFLHLLRLIQSLR*SICRSLLPLLWISF 
LLI LPQQQKP WGLCQTR VKRPVDIS *TLP * CHQNVCQQPRKRK 
QKT* VTSPVKVK/ VS I PLAVTDALEHIMEQLNVLTQTVS I LEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 


5830 


4496 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLTGIESMDQCRHTLEQHNW 
NI EAAVQDRLNEQEGVPS VFNP P PS RPLQVNTADHRI YS YWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGDIVSFMHSFEEKYGRAHPVFYQGTYSQALNDAKRELRFL 
LVYLHGDDHQDSDEFCRNTLCAPEVISLINTRMLFWACSTNKPE 
GYRVSQALRENTYPFIAMIMLKDRRE * PV\ VGRLEGLI \QPDDL 
INQLTF IMDANQT YLVS ERLEREERNQTQVLRQQQDEAYLAS LR 
ADQEKERKKREBRERKRRKKEEVQQQKLABERRRQNLQEEKERK 
LECLPPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTIiQE\A 
GLS HTEVLFVQDLTDE 


5831 


71 


2897 


FCSKDKCCLYLPDSINRSKSCrAKPGAHSQDRHAVMDSERQVKD 
TDD I ESP KRS I RDSGYI DCWDSERSDS LSP PRHGRDDS FDS LDS 
FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLPDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KS WS TATS PAGLGKKALQD YGPRT\ P VS \ DDAES TSMFDMRC3E 

EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDLI KKEEERKKMEKLLAGEDGTSERRKSIKTYRE IVQEKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKtLE 
RSHSTEPNLSSFLNDPNPMKYLRQQSLPPPKPTATVETTIARAS 
VLDTSMSAGSGSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVSVNGETVHREBEKERECPTVAPAHSLTKSQMFEGVARVH 
GSPLELKQDNGS I EINI KKPNS VPQELAATTEKTEPNSQEDKND 
GGKS R KGNI ELAS S E PQH FTTT VTRCS PTVAFVE FPS S PQL KND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 

TC MPS' IVNY*lT .If T .DMT MCnWnCDCP PVCTnrrntnDvxfm • «**«wnMF.___ 

!\TJirn/UNUijHLjFfiJjWbUvUbP&bfcKSPVTTPFKFWAWDPEEERRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEB\WEKAQKEVEEEERRY 
YEEEP ♦ 1 1 \ EDP WPFTVSSSSADQLSTSSSMTEGSGTMNK1 DL 
GNCQDEKQDRRWK KS FQGDDSDLLLKTRESDRLEEKGSLTEGAL 
AHSGNPVSKGVHEDHOLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSSBDVKPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PLGKGAAMIIETLNLYFHIQCFRCG\ICKGQLGDAVSGTDVRIR 
NGLLNCNDCYMRSRSAGQPTTL 


5832 


2454 


829 


PGRRFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSHDFSN 
SENLEKLEKLGKSSDLVSRLPTIYRNAHDIKNKSSAPSRVPPLF 
VPQGTSER KDS S GS V S PN TLS QE EGDQ I ClrYH I R KSCS FQD KC H 1 
R VHFHLP Y RWQ fr'LDRG KW EDLDNM ELI EE AYCNPKI E R I LCSE S 
ASTPHSHCI^FNAMTYGATQARRLSTASSVTKPPHFILTTDWIW 
YWSDE FGS V7QE YGRQGTVHP VTTVS S SD VE KAYLAY / W YTG V* R 
PGSHLEVPGRKAQLRVR FQS LRS EKPGLWHN * KGLPQTQ I R \ A P 
QDVTTMQTCNTXFFG P KS I PD YWDS S ALP DPGFQKI TLS S S S EE 
YQKVWNLFNRTLPFYFVQKIERVQNLALWEVYQWQKGQMQKQNG 
GKAVDERQLFHGTSAI FVDA I CQQNFDWR VCGVHGTS YG KGSYF 
ARDAAYSHHYSXSDTQTHTMFLARVLVGE FVRGNAS FVRP PAKE 
GWSNAFYDSCVNSVSDPSIFV1 FEKHQVYPEYVIQYTTSSKPSV 
TPS I LLALGS LFSSRQ 


5833 


170 


3289 


SILCLLSPCWQFGKPWSILSSRSRHSPCTKKGWEGMRKHLHT 
RQGHK* VHVE I S KAL W VYRDDY F I RH S I S VS AVI VRAW I THKYR 
GRDWNVKWEENLLHAVAKNYTLLQTI P P FER PFKDHQ VCLEWNM 
GYlWNLRANRIPQCPLElfDWALLGFPYASSGENTGIVKKFPRF 
RNRELEATR RQRMD Y P V FT VSL W LY L LHYCKANLCG I LY FVDSN 
EMYGTPSVFLTEEGYLHIQMHLVKGEDLAVKTKFIIPLKEWFRL 
D I S FNGGQI WTTS IGQDLKS YHNQT I SFREDFH YNDTAG YFI I 
GGSRYVAGIEGFFGPLKYYRLRSLHPAQIFNPLLEKQLAEQIfCL 
YYERCAEVQEIVSVYASAAKHGGERQEACHLKNSYLDLQRRYGR 
PSMCRAFPWEKELKDKH PS LFQALLEMDLLTVPRNQNESVS EIG 
GKI FE K.ZWKRLSS IDGLHQISS I VPFLTDS S CCGYHKAS YYLAV 
FYETGLNVPRDQLQGMLYSLVGGOGSERLS SMNLGYKHYQGI DN 
YPLDWELS YAYYSNIATKTPLDQHTLQGDQAYVETIR T iKDDE I L 
KVQTK E EX3DVFM WLKHE ATRGNAAAQQRLAQML FWGQQG VAKNP 
EAAI EWYAKGALETEDPALI YD YAI VLFKGQGVKKNRRLALELM 
KKAASKGLKQAVNGLGWYYHKFKKNYA\KAAKYWLKA\EE\MGN 
PDAS YNLGVLH LDG I F PG VPGRNQTLAGEY FHKAAQGGHMEGTL 
WCS LY Y I TGNLET FPRD P E KAWWAKH VAE KNG YLGHV IRKGLN 
AYLEGSWHEALLYYVLAAETGIEVSQTNLAHICEERPDLARRYL 
GVNCVWRYYNFSVFQIDAPSFAYLKMGDLYYYGHQNQSQDLELS 
VQMYAQAALDGDSQGFFNLALLI EEGTI IPHHILDFLEIDSTLH 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SAL2 YFLGTFLLS I LIAWTVQYFQS VSASDPPPRPSQASPDTAT 
STASPAVTPAADAS DQDQ PT VTNNPEPRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/'QECG \ 

SAAPGPIPGQSSS*VPLRLEOIQQKADCPLSLELALKPRMAAQV 

TLBDALSNVDL LEE LPLPDQQ PC I E P P P SS LLYQPNFNTN FEDR 

NAFVTG IARYI EQATVHSSMNEMLEEGQEYAVMLYTWRSCSRAI 

PQVKCNEQP1^VEIYEKTVEVI*EPEVTKLMNFMYFQRNAIERFC 

GEVRRLCHAERRKDFVSEAYLITLGKFINMFAVLDELKNMKCSV 

KNDHSAYKRAAQFLRKMADPQSIQESQNLSMFLANHNKITQSLQ 

QQL BV I SG YEB LLAD I VNLCVDY YENRM YLTFS E KHMLLKVMG F 

GLYLMDGS VSN I YKLDAKKR INLS K I D KYFKQLQ WPLFGDMQ I 

ELARYIKTSAHYBENKSRWTCTSSGSSPQYNICEQMIQIREDHM 

RFISEIiARYSNSEVVTGSGRQEAQKTDAEYRKLFDLAIjQGLQLL 

SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 

E EKP ALVE V I AM I KGLQVLMGRMES VFNHAI RHT VYAALQD FS Q 

VTIjMEPLRQAIKXKKNVIQSVIjQAIRKTVCDWETGHEPFNDPAL 

RGEKDPKSG*DIKVPRRAVGPSSTQLYMVRTMLESLIADKSGSK 
KTLRSSLEGPTILDIEKFHRESFFYTHLINFSETLQQCCDLSQL 
WFREFFLELTMGRRIQFPIEMSMPWILTDHILETKEASMMEYVL 
YSLDLYITOSAHYALTRFNKQFLYDEIEAEVNLCFDQFVYKLADQ 
IFAYYKVMAGSLLLDKRLRSECKNQGATIHLPPSNRYBTLLKOR 
HVQLLGRSIDLNRLITQRVSAAMYKSLELAIGRFESEDLTSIVE 
LDGLLEINRMTHKLLSRYLTLDGFDAMFREANHNVSAPYGRITL 
HVFWELNYDFLPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 

QYLHGSKALNLAYSSIYGSYRNFVGPPHFQVICRLLGYQGIAVV 
MEELLKWKSLLQGTI LQYVKTLME VM PKICRLPRHEYGSPGI L 
EFFHHQLKDIVEYAELKTVCFQNLREVGNAILFCLLIEQSLSLE 
EVCDLLHAAPFQNILPRVHVKEGERLDAKMKRLESKYAPLHLVP 
LIERI/STPgQIAIAREGDLLTKERl^CGLSMFEVILtRlRSFLD 
DP I WRGPLP S NG VMHVDE CVE FHRLWSAMQ FVYC I P VGTHE FTV 
EQCFGDGLHWAGCMIIVLLGQQRRFAVLDFCYHLLKVQKHDGKD 
EIIKNVPLKKMVERIRKFQILNDEIITILDKYLKSGDGEGTPVE 
HVRCFQPPIHQSLASS 


5835 


4203 


1904 


SGN I RMAQGSHQ I D FQ VLHDLRQKFPE V PEVWS RCMLQNNNN L 
DACCAVLSQBSTRYLYGEGDLNFSDDSGISGLRNHMTSLN'LDLQ 
SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
PRFNPItWTLAPNIQTGRNTPTSLHIHGVPPPVLNSPQGNSIYI 
RPY ITT PGGTTRQTQQHS GW VSQFNPMN PQQVYQ PSQ PG P WTTC 
PASKPLSHTSSQQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
SQSSAHSQYNIQNISTGPRKNQIEIKLEPPQRNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNQPTVYIAASPPNTDELMSRSQPKVYISA 
NAATGDEQVMRNQPTLFISTNSGASAASRNMSGQVSMGPAFIHH 
HPPKSRAIGNNSATSPRWVTQPNT\EYTFKITVSPNKPPAVSP 
GWSPTFELTNLLNHPDHYVETENIHHLTDPTLAHVDRISETRK 
LSMGSDDAAYTQDI *RISNS WLGMVAHACNSSALGGQDGRII *A 
QEFETSWGNIWRLRLYRRF*NYAGMVAHTCSPSYSVD*AIjLVHQ 
KARMERLQRELBIQKKKLDKLKSEVNEMENNLTRRJUiKRSNSIS 
Q I PS LE EMQQLRS CNRQLQI DI DCLTKE I DL FQARG PH FNPS AI 
HNFYDNIGFVGPVPPKPKDQRSIIKTPKTQDTEDDEGAQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 


2303 


FHITMCGICCSVNFSAEriF'sObLKEDIiLYNLKQRGPNSSKQLLK ' 
SDVNYQCLFSAHVLHLRGVLTTQPVEDERGNVFLWNGEIFSGIK 
V2AEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
HYLWFGRDFFGRRSiiLWHFSNLGKSFCLSSVGTQTSGLANQWQE 
VPAS \DFS ELILSLLS FPDALFYNC I LGNI FLGR ILLKKMLI A* 
VKFQQTYQHLYQR*QMKPNCILKNLLFL*I+CCHKLHWRLIAVI 
FPMCHLQERYFKS FLLMYT * KEVIQQFI DVLSVAVKKRVLCLPR 
DENLTANE VLKTCDRKANVAI LFSGG I DSMVI ATIADRHI PLDE 
P IDLLrfVAFIAE BKTMPTTFNREGNKQ KNKCEI P S EE FS KDVAA 
AAADSPNKHVSVPDRITGRAGLKELQAVSPSRIWNFVEINVSME 
ELQKLRRTRI CHL I R PLDTVLDDS IGCAVWFASRG I GWLVAQEG 
VKSYQSNAKWLTGIGADEQIiAGYSRHRVRFQSHGLEGLNKEIM 
MELGR I S S RNLGR DDR VIGDHG KE ARFP FLDENWS FLNSL P I W 
E KANLTLPRG IG EKLLLRLAAVE IX3LTAS ALLPKRAMQ FGSR I A 
KMEKINEKASDKCGRLQIMSLENLSIBKETKL 


5837 


4792 


903 


NGNAVAQAP VTNCCYLATGSKDQTIRI WS CSRGRGVM I LKLP FL 

KRRGGG I DPTVKERLWLTLHWPSNQPTQLVSS CFGGELLQWDLT 

QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 

RDVKCWDIATLECSWTLPSLGGFAYSLAFSSVDIGSLAIGVGDG 

MIRVWNTLS IKNNY DVKNFWQGVKSKVTALCWHPTKEGCLAFGT 

DDGKVGLYDTYSNKPPQISSTYHKFCTVYTLAWGPPVPPMSIjGGE 

GDRPSLAL YS CGGEG I VLQHNPWKLSGEAFDIMKLI RDTNS I KY 

KLPVHTEI SWKADGKIMALGNEDGSIEI FQ\ I PNLKLICTIQQH 

HKLVNTISWHHE\HGSPAQKLSYL\MPSGSQQCSPFTCKNLKNC 

P * KAAPBS PSDPLQS P YRTPPQGHTAQDYPVWAWEPHIH * WEGL 

V FCFP I DG YS P GCW D \ AFPGKEAP VAI FRG \ HQGRLLCVAWS PL 

DPDCIYSG\ADDFCVHKWLTSMQDHSRPPQGXKSIBLEKKRLSQ 

PKAKPKKKKKPTLRTPVKtiESIDGNEEESMKENSGPVENGVSDQ 

EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 

ILLKKEP P KEKPETL IKKRKARS LLPLSTSLDHRSKEELHQDCL 

VLATAKHSRELNEDVSADVEERFHLGLFTDRATLYRMIDISGKG 

HLENGHPELFHQLMLMKGDLKGVLQTAAERGELTDNLVAMAPAA 

GYHVWLWAVEAFAKQLCFQDQYVKAASHLLSIHKVYEAVEIiLKS 

NH FYREA I AI AKARLRP 3DP VL KDLYLS WGTVLERDG H YAVAAK 

CYLGATCAYDAAKVLAKKGDAAS LRTAAELAAI VGEDELSASLA 

LRCAQELLLANNWVGAQEALOLHESI^GQRLVFCLLELLSRHLE 

EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKSIFSLDTPEQY 

QEAFQKLQNIKYPSATNNTPAKQLLLHICHDLTLAVLSQQMASW 
DEAVQAUjRAWRSYDSGSPTIMQEVYSAFLPDGCDHLRDKLGD 
HQS PAT P A F KS LEAF FL YG RLYE FWWS LSRPC PNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDIiRLTEEGERM 
LSTFKELFSE FCHASLQNSQRTVAEVQETLAEM I RQHQKSQLCKS 
TANGPDKNBPEVEAEQPLCSSQSQCKEEKNEPLSLPELTKRJJTE 
ANQRMAKF PE S I KAW P F PDVLB CCLVLLL I RSH F PGCLAQEMQQ 
QAQEliLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTMPHLLVTFRDVAIDFSQEEWECLDPAQRDLYRDVMLENYSNL 
ISLDLESSCVTKKLSPEKEIYEMES\PSGRIWGNVSTITFQYNG 
LGDNMECKGNLEGQVSKSEGLYMCVKITCBEKATESHSTSSTFH 
RI I /H YQGKI VKCKE CRQG FS YLSCLI QHEENHNI * KCS EVNKH 
RNTFS KKPS YI * HO\ KFRLGEKPYECMECGKAFrtRT^nT.TnwnK' 
IHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKPYECKECGKAFILGSHLTYHQRVHTGEKPYICKECGKAFLCA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKECGKAFISNSNLIQHQRIHTGEKPYKCKECGKAFICGKQLS 
EHQRIHTGEKPFECKECGKAFIRVAYIjTQHEKIHGEICHYECKEC 
GKT FVRATQLT YHQR I HTG EKP YKC KECDKAF/ HL WLT I LS EHQ 

RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGEKPYTCVQCGKDFRCPSQLTQHTRL 
HN*EYSSHKICMHSIALASLDFAHLQEKNPEN 


5839 


1 


2425 


GRPFPRPPRALPRLPLRGRRQDGRWT\TOFEECLKD\SPRFRAAL 
EEVEGDVAELELKL\DKLVKLCIA\MIDTGKAFCVANKQFMNGI 
RD\ LAQNS \NNDA\ WETKFAPS FLDSLQEM INFHTIL/L* PNS 
EIN*GHS FQNFVKEDLRKFKDAKKQFEMSQ* KRKKIALVKNAPV 
PSRPASLEL * KP PNI LTATRKCFRH I ALDYVLQINVLQS KRRSE 
ILKSMLSFMYAHLAFFHQGYDLFSELGPYMKDLGAQLDRLVGDA 
AKEKREMEQKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RASNAFKTWNRRWFS IQNNQWYQKKFKDNPTWVEDLRI*CTVX 
HCEDIERRFCFEWSP'i'KSCMijOAriSFifr.noawTK'avnTCT Vit 
AYRB KDDESE KLDKKSS PS TGSLDSGNESKE KLLKG ESAIiQRVQ 
CI PGNASCCDCGLADPRWAS INLGI TLCI ECSG IHRSLGVHFS K 
VRSLTLDTWEPELLKLMCELGNDVINRVYEANVEKMGIKKPQPG 
QRQEKEAYIRAKYVERKFVDKIFL*SLSPP\EQQKK\FVSKSSE 
EKRLSISKFGP\GDQVRASAQSSVRSWDSGIQQSSDDGRBSLPS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGLQLYRASYEKNLPKM 
AEALAHG ADVNWANSEENKAT PL IQAVLGGSLVTCE FLLQNG AN 
VNQRDVQGRGPLHHATVLGHTGQVCLFLKRGANQHATDEBGKDP 
LS I AVE AANAD I VTLLRLARMNEEMRE S EGLYGQ PGDETYQD I F 
RDFSQMASNNPEKL>JRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQI S S PRWRS PQRAFMSALSKTQTQSAPALQ 
GLSS LLQS VTGNP VPASEAASQSTS ASPANTT VYTI KGRKL PS S 
AQPFIPKSFNYSPNSSTSEVSSTSASKASIGQSPGLPSTAFKLP 
SNTKG FTATHNTS PAA P P TE VTI CQS S E VSKPKL\ ES ESTS P S L 
\3MKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLKDSSQEKFYPDTSFQEDEDYRDFEYSGP 
PPSAMMNLQKKPAKSILKSSfCLSDTTEYQPILSSYSHRAQEFGV 
KSAFPPSVRALLDSSENCDRLSSSPGLFGAFSVRGNEPGSDRSP 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
LFSPQNTIJ^APTGHPPISGVEKVLASTISTTSTIEFKNMLKNAS 
RKPSDDKHFGQAPS KGT PS DGVSLSNLTQPSLTATDQQQQBEH Y 
RIETRVSSSCLDLPDSTEEKGAPIETLGYHSASNRRMSGEPIQT 
VES IR V PGKGNRGHGREASRVGWFDLSTSGSS FDNGPSSASELA 
SLGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHr.PS 
VDLSNPFTKE AALAHAAPP PP PGEHSGI PFPTPP PP P ? PGEHSS 
SGGSG VP FS T P P P P P P PVDHSGWP FPAPPLAEHG VAGAVAVF P 
KDHSS LLQGTLAE H FG VLPG PRDHGG PTQRDLNGPGLS R VRESL 
TLP S HS LERLGP PKGGGGGGG SNS S SG P PLG PSHRDT I S RSG 1 1 " 

LRSPRPDFRPREPPLSRDPFHSLKRPRPPPARGPPFFAPKRPFF 
PPRY 


5841 


1900 


762 


GLRiiFLVLTVWPMMKPSWLSRTEFSKRLIjCRTLWCQSGWSSRSY 
TRSMLKMTTS I NRRS RTS TKSTRTS AR PGLTATVS IGLS DSPTW 
RHCWMTARSCSGEXGGHWAPRQVGVYLLPGRVGCVSSRVSPSFP 
GDGLDSGIiARRGSAVS ALASGLVEE PMIiGP PFHPTPR FJCAVSAK 
SKEDLVSQGFTBFTIEDFHNTFMDLIEQVEKQTSVADLLASFND 
QSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QB 
\ VE P MCKES DH I H 1 I ALAQGLQR VH PGWE YMG? R PRAATTNPH I 
FP*GLPSPKVYLLYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHCYRGFSWVKWSYFTPFFLSHDPPPMFY 


5842 


307 


1918 


QEPTADFKLRSTCGCGREMTCPDKPGQLINWFICSI^VPRVRla" 

WS S RRPRTRRNLLiLGTACAI YLG FLVSQVGRAS LQHGQAAEKGP 

HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTLQPNWYITLRSK 

RSKPANIRGTVXPKRRKKHAVASAAPGQEALVGPSLQPQEA\EG 

KLM1,*HLGTLREQTWLRLESDPGGWCGVRE/WRAGGPDFLQPSS 

RESNI RX YS E SAPS WIiS KDDI RRMRLLADS A VAGLR P VS S RSG A 

RLLVLEGGAPGAVLRCGPS PCGLLKQPLDMSE VFAFHLDR I LGL 

NRTLPS VSR KAE F IQDGR P C P 1 1 LWDASLS S ASNDTHS S VKLTW 

GTYQQLLKQKa^ONGRVPKPESGCTEIHHHEWSKMALFDFLLQI 

YNRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHIIQRKH 

DPRHLVFIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVLKSQH 

LRQKLLQSLFLDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 

AHGVKVLPMNE 


5843 


500 


1453 


GTARLVTCWVLHGQ*VKKPAWEPGWWb*Q*RCRPKGWGLGAGM 
R3SRMS QP PQCLRRAQS S CCHFMVKbLDDGTFM I PGEKVAHTS h 
DALVTFHQQKPIEPRRELLTQPCRQKDPANVDYEDLFT.YSNAVA 
EEAACPVSAPEEASPKPVLCHQSKERKPSAEM/RQNNHQGSHFL 
LPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCBLWT 
LGCPE I HGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGH I SQKP 
LTAPGTKRQKG PHQEGRE VGQLH+GDPRGQELAPNGS ES P I LPG 
VQARAPGLGRA 


5844 


202 


2471 


FDSAVLSSlNVMAVLPGPIiQIiLGVLIiTISLSSIRblQAGAYYGI 
KPLPPQIPPQMPPQIPQYQPLGQQVPHMPIiAKDGLAMGKEMPHL 
QYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKEIPLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PG KPG AMGMPGAKG E I GQ KGE IG PMG I P * PQGPPG PHGLPG I G K 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPIGVPGVQGPPGIPGIGKPGQDG\IPGQPGFPGGKGEQGL 
PGLPGPPGLPGIGKPGFPGPKGDRGMGGVPGALGPRGEKGPIGA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPXGEGGIVGPQGPPG 
PKGEPGLQGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL, 
PGPPGPPGPPGPPAVMPPTPPPQGEYLPDMGLGIDGVKPPHAYG 
AKKGKNGG PA YEM PAFTAELTAP F P P VGAP VK FNKLL YNGRQN Y 
NPQTGIFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 

EYKKGFLDQASGSAVLLLRPGDRVFLQMPSEQAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


HASNKSASLQDKMANPKEKTAMCLVNELARFNRVQPQYKLLNER 
GPAHSKMFSVQLSLGEQTWESEGSSIKKAQQAVGNKALTESTLP 
KPI*KPPKSNVNNNPGCITPTVELNGLAMKRG\KPAIHRPLDPK 
PFPNNRANYNFQVMYNQRYHCPIPKIFYVQLTVGNNEFFGEGKT 
RQAARHNAAMKALQALQNEPIPERSPQNGESGKDMDDDKDANKS 
E I S L VFE I AL KRNMP VS FE VI KES GP PHM KS FVTR VS VG E FS AE 
GEGNSKKLSKKRAATTVLQELKiaPPLPWEKPK\HFFKKRPKT 
IVKAGPEYGQGMNPISRLAQIQQAKXEKEPDYVLLSERGMPRRR 
EFVMQVKVGNEVATGTGPNK30AKKNAAEAMLLQLGYKASTNIJQ 
DQLEKTGBNKGWSGPKPGFPEPTNNTPKGILHLSPDVYQE14EAS 
RHKVISGTTLGYLSPKDMNQPSSSFFSISPTSNSSATIARELLM 

NGTSSTAEAIGLKGSSPTPPCSPVQPSKQLEYLARIQGFQVHYC 
DRQSGKECVTCIiTIjAPVQM'i'FHA IGSS IEASHDQV* YATMLLC 
YGPARKWKAIKMEAWCAHAALLSLIHYL1APSARLEKSKLFALG 


5846 


1126 


456 


FSKLIKKTFIIGISGVTNSGKTTLAKNIiQKHLPNCSVISQDDFF 

KPESEIETDXNGFLQYDVLEALNM2KMMSAISCWMESARHSWS 

TDQ2SAEEIPILIIEGFLLFNYKPLDTIWNRSYFLTIPY3ECKR 

RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEVVYUX3T 

KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTNPS /CK* IRK 
LQGVI 


5847 


2769 


505 


APEMEDLSSPDSTLLQGGHNLLSSASFQESVTFKDVIVDFTQEE 
WKQLDPGQRDLFRDVTLENYTHLVSIGLQVSKPDVISQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
KKRDDSWS SNLLES WE YEGSLERQQANQQTLPKE I KVTEKT I PS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTGEKPYKCN*/CVEKAF 
SRSENLINHQRIHTGDKPYKCDQCGKGFIEGPSLTQHORIflTGE 
KPYKCDECGKA7SQRTHLVQHQRIHTGEKPYTCNECGKAFSQRG 
HFMEHQKIHTGEKPFKCDECDKTFTRSTHIiTQHQKIHTGEKTYK 
CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
HQKTHTGEKPYDCAECGKSFSYWSSLAQHLKIHTGEKPYKCNEC 
GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 
HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 
SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 

KPYECA3CGKAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 

HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGBKPYK 

CNECGK\TFSQSTYLIQHQRIHSGBKPFGCNDCGKSFRYRSA^N 
KHQRLHPGI 


5848 


22 


2961 


AAPRR LijRGGDGDRTPR FPL P ALLR PGP P AEAAP ER R KM PA VS K 
GDGMRGIiAVFISDIRNCKSKEAElKRINKELANIRSKFKGDKAli 
D3YS KKK Y VCKLLFIFLLGHD I D FGHMEA VNLLS SNR YTE KQ I G 
YLF 1 SVLVNSNS EL IRLINNAI KNDLASRNPTFMGLALHC I AS V 
GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLLRLYRTSPDL 
VPMGDWTSRWHLLNDQHLGWTAATSLITTLAQKNPEEFKTSV 
S LAVS RLS \R I VTS AS TDLQD YTY * FC PG FLGLS VKLLRLLQC Y 
PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 
I S L 1 1 HHDS E PNLLVRACNQLGQ FLQHRETNLR YLALES M CTLA 
S S E FS HE AVKTH I ET VI NALKTE R DVS VRQRAVDLLYAMCDR SN 
APQIVAEMLS YLETADYS IREE I VLKVA I LAEKYAVDYTW\ YVD 
TILNLIRIAGDYVSEEVWYRVIQIVINRDDVQGYAAKTVFEALQ 
APACHENLVKVGGYILGEFGNLlAGDPRSSPLrQFHLLHSKFHL 
CSVPTRALLLSTYIKFVNLFPEVKPTIQDVLRSDSQLRNADVEL 
QQRAVEYLRLSTVASTDILATVLEEMPPFPERESSIIiAKLKKKK 
GPSTVTDLEDTKRDRSVDVNGGPEPAPASTSAVSTPSPSADLLG 
LG AAP PAP AG PP PS SGG SGLLVD VFS DS AS WAPLAPG S EDN FA 
RFVCKNNGVLFENQLLQIGLKSEFRQNLGRMFI FYGNKTSTQFL 
NFTPTLICSDDLQPNLWIiQTKPVDPTVEGGAQVQQWNIECVSD 
FTE AP VLNIQFRYGGT FQNVS VQLP I TLN KFFQPTE MASQDFFQ 
RW KQLSNPQQE VQN I F KAKH PMDTE VTKAK I IG FGS ALL EE VD P 
NPANFVGAG 1 1 HT KTTQ IGCLLRL E PNLQAQ MYRLTLRTS *<EAV 
SQRLCELLSAQF 


5849 


3545 


1895 


KRREIKETVFHHVAQAGLELLSSSNPPSSASRSAGITGMRHQVQ 
P *DPCMSLS P PCFTEEDRFSLE ALQTI HKQMDDDKDGG I E VEES 
DEFIREDMKYiCDATNKHSHLHREDKHITIEDLWKRWKTSEVHNW 
TLEDTLQWLIEFVELPQYEKNFRDNNVKGTTLPRIAVHEPSFMI 
SQLKISDRSHRQKLQLKALDWLFGPLTRPPHNWMKDFILTVSI 
VI G VGGCW FA YTQNKTS KEHVAXMMKDLES LQTAEQS LMD LQER 
LEKAQEENRNVAVEKQNL*RKMMDEINYAKEEACRLRELREGAE 
CEI^RRQYAEQELEQVRMALKKAEKEFEIiRSSWSVPDAJLQKWLQ 
LTHEVEVQYYNIKRQNAE^LAIAKDEAEKIKKXRSTVFGTLHV 
AHSSSLDEVDHKILEAKKALSELTTCLRERLFRWQQI^KiCGFQ 
IAHNSGLPSLTSSLYSDHSWVVMPRVSIPPYPIAGGVDDLDBDT 
PPIVSQFPGTMAKPPGS1ARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPD1LSVSSCPALYRNEEE 
EEAIYFSAEKQWEVP0TASECDSLNSSIGRKQSPP/SKPRDIPN 

IIS/DERYQEMRCP*RIPSGGIL 


5850 


3 


1895 


KAVLNFSASGSVISLTGSNPMHUASMWHLKKNGIIVYLDVPLLN 

LICRLKLMKTDRIVGQNSGTSMKDLLKFRRQYYKKWYDARVFCE 
SGASPEEVADKVLNAIKRYQDVDSETFISTRHVWPEDCEQKVSA 
EFFIEAVIEGLASDGGLFVPAKEFPKLSCGEWKSLVGATYVERA 

QILLERCIHPADlPAARLGEMIETAYGENFACSKIAPVRHJbSGN 
QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
DTGSAVLNG FSRLNKNDKQRIAWAFFPENGVSDFOKAQI IGSQ 
RENGWAVGVESDFDFCQTAIKRIFNDSDFTGFLTVEYGTILSSA 
KS I NWGRLL PQWYHAS AYLDLVSQG FI S FGS PVDVCI PTGNFG 
KILAAVYAKMMGI P I RKFI CASNQNHVWTDFIKTG\HYDLRGKE 
N*AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTELFNRLBS 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAKW ADRVQDKTCP V I ISSTAHYSKPAPAIMQALKI KE I 
NETSSSQLYLLGSYNALPPLHEALLERTKQQEKMEYQVCAADMN 
VLKS HVEQLVQNQ FI 


5851 


3120 


1802 


RCYLQFIiAIiLLTSTSARAAAAIAAABEPAGSPSVMTRAGDHNRQ 
RGCCGSLADYLTSAKFLLYLGHSLSTWGDRKWHFAVSVFLVELY 
GNSLLLTAVYGLWAGSVLVLGAIIGDWVDKNARLKVAQTSLW 
QNVSVI LCG 1 1 LMMVFLHKHELLTMYHGWVLTSCYILI I TIAN I 
ANLAS TATAI T IQRDWI VWAGEDR S KLANMNATI RR I DQLTN I 
LAP MAVGQI MTFGS PV I GCG F ISGWNLVSMCVE YVTiLWKV YQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIH 
ELEHE QEPT CAS QMAEP FRTFRDG WVS YYNQPVF/ LGWHGS CFP 
LYDCPGL*LHHHRVRLHSGTEWFHPQYFDGS IS YWWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSS YPSLPALLRARSAPGHCTHRSCGPEWRIDS I S RLEMQGA2R 
SGWAQAQPT I LLLVPRLRKSLPS I WG / S LMGF FI TSGPG / W FRQ 
YYFFI SGRH* VLFTESDFY YVAMDFGGHGL3SHYS PGVPYYLQT 
FVS B I RRWAGKKQS VYFRRCGGCSRAP PLITGGGVGSRKQRWP 
ESGAWAliAPGLPAIHGRSWES 


5853 


223 


1346 


RLLGLSRVKGLHG PAASAWISDPETRGD PGGPWGtWRGSDLR PR 
PVSLTGLTLVCK+AAQGPQV\HSVKLCFGLGG\PCLL\FPIFRP 
LLLKPRRPRLHPGTRGVAVEPHALRWHVAHGEEAGIRAAGPGH 
GGVE I PQG/VGSLGARRGLRPSR PS SRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGLGQDQGPREQQKQGSGRHDTILGDWGESE 
SRWVRGNFRTGTAATLIGFSRNPTLNGSENWGSLVSIQEEGPDT 
GWEREKRNPAEMGNPQRWASPIHTPPLGPEILRAMPEALRAMPE 
ALGLRPDPATSVPSALS/QTF/PESWPRSCLRNQGETLGMGPVP 
LSSLCITESPSQNWTPCLLLLTCPRGLF 


5854 


86 


938 


KGRNTAPEKKGAALNNRENASS*NGY/SRWKQDIRRIENHI1QE 
LXHLCAMIXRVLLERLENTRKLRELTEGRTLDWPQNRITEVSAK 
RQI VTE YRE JCGKRN+ E EKKRDLEGRSRRYNLCI IG I PETEDRAS 
GAE T I KDLLE/ENFP EL KNELD LQMEKAHR I PLK FNEKKAASRH 
I RVTF L / KFQRRN I LQ ASSQR KQVTYKG AKVRLTS D FS PAI LN A 
RRQW/N/PISRVLRENNFEPRIIYSAKLSFLYKGNWKTFLDIQG 
LGKYINQELS LKILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFLLLPSLLMGYSESPPPITDSWAP 
FISLTKH VLSQSQS PLS SNCWI CbSTHTQ* FTALPADLLTWTQS 
NVSLHISYLAIPFLADSFbKPV/L*PGNSAKHLSFKLSSLSMVS 
GRAVALLHL I ASGLTS IQTNTASS KPPI WGY\LSTOTS FI SP PP 
LCLSRTYPNPAHATMVGQVPQSLCGLIFTL/ RTPCRPS ILHPNY 
KI ISTSAWQKVLCFSGSPTI HTSIjHLTTGSS FLSFHP I PG FPAA 
NSALYVSS LKG PPGKNVTI PS PVTGT*QPPHRGSN/RLTVDKDN 
FFLSPKPNSL^QLPSQ\TPYQAbTGAAlAGSYPIWENENTLSWlH 

PTFTYNFCLSTPSLFFLCDTN+YLCLPANWSGTCTLVFQAPTIN 

ILPPNQTILISVEASISSSPIRNKWALHUTLLTGLGITAAliGT 

GIAGITTS ITSYQTLFTTLSNTVEDMHTSITS LQRQLDFLVGVT 

LQNWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 

RAAEL*HQVADSWWQGSSLLRVJIPWVAPFLGPLIFLFLI*LMIGP 

CIFNLVSRFISQRLNCFIQASMQKHIDNIFHLCHV*YQSLRGNH 
SEAPEPRP 


5856 


173 


1137 


P WLHG LGL S AVFLF YL * / Y VT FH1YGG 1 1 LLLL I FI S I AG I L Y K 
FQD VLLYF P EQ PSS SRL Y VPM PTG I P HEN I FI RTKDG I RLNLI L 
IRYTGDNSPYSPTI1YFHGNAGNIGHRLPNALLMLVNLKVNLLL 
VDYRGYGKS EGEAS EEGLYLDSEAVLDYVMTS PDLDKTK I YLSG 
RSLG\GAAAIHLASDNSHRISAIMVENTFTiS I PHMASTLFSFFP 
MRYLPLWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPPVMKKQ 
LYELS PSRTKRLAI FPDGTHNDTWQCQG YFTALEQF I KEWKSH 
SPEEMAKTSSNVTII 


5857 


1597 


563 


KIi I GKVLVLS WADAMAA FAVE PQG P ALG SEPMMLGSPTSPKPG 
VNAQFLPGFLMGDLPAP VTPQPRS ISGPSVGVMEMRS PLLAGGS 
P PQ P WPAHKD KSGAP P VRS I YDD I S SPGLGS TPLTS R RQ PN 1 S 
VMQSPLVGVTSTPGTGQSMFSPASIGQPRKTTLSPAQLDPFYTQ 
GDSJjTSEDH\LDDSWGDCI WGFLKASA\S YI LL\QFAQYGGIS * 
NMWMSNTGNWMH I R YQ S KLQAR KALS KDGR I FGESI K IG VKPC I 
DKSVMESSDRCALSSPSLAFTPPIKTLGTPTQPGSTPRISTMRP 
LATAYKASTSDYQVISDRQTPKKDESLVSKAMEYMFGW | 


5858 


355 


1419 


pphqpaaastsxhqqg^pppppqdsskpwaqgpgpapgvgsapH 

PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 

PS3GVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 

GGPKPGGGPGLSTP3GHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 

GPPPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 

LLGIYLLISRRMNSRRLFAKIWENQEKFLSTKAKDSEFIKLESR 

ALA*NCPKFELG*YTP*GGRQLPSSLFPTHACLPLSCSVIFSPF 

MFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCLN 
FAS 


5859 
5860 


307 


1503 


GGSSARPRASSRRMLSRKKTKNEVSKPAEVC^KYVKKf^SPTLR] 
NLMPSFIRHGPTIPRRTDICLPDSSPNAFSTSGDGWSRNQSFL 
RTPIQRTPHEIMRRESNRLSAPSYLARSLADVPREYGSSQSFVT 
EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDL FQRM PQNQGRHASG IGR VAATS LGN LTNHGS E DL PL P PG WS 
VDWTMRGRKYY IDHNTNTTHWSH PLEREGLPPGWERVESSEFGT 
YYVDHTNKKAQY\RHPCArTCTSV*STTSCHl/AS/RQQTERNQ 
SLLVPANPYHTAEIPDWI^VYARAPVKYDHILKWELFQLAD1,DT 

YQGMLKLLFMKELEQIVKMYEAYRQALLTELENRKQRCX2WYAGQ 
HGKNF 1 




2956 


1270 


TIRVEEFPLCPGGGKAQLSSASLI^GLLLQPPTPPPLlilXFT| 

LLTjFSRLCGALAGPI I ve phvtavwgknvs lkcli e vneti TQ I 

S WEKI HG KS SQT VAVHHPQ YG FS VQGE YQGR VLFKNYS LNDAT I 

TLHNIG FSDSGKYI CKAVTFPLGNAQ9STTVTVLVEPTVSLI KG 

PD S L I DGGNETVAAI CI AATG KP VAHI DW EGDLGEME STTTS FP 

NETATIISQYKLFPTRFARGRRITCWKHPALEKDIRYSFILDI 

QYAPEVSVTGYDGNWFVGRKGVNLKCNADANPPPFKSVWSRLDG 

Uvvt'iAjljJaAbDNTIiHFVHPLTFNYSGVYICKVT\NSPGSKEVTQK 1 

VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRD\LATEP+KIA 

PSPLSTL\ATIKGWTQLPTIIA*CSGVGALFIV\LVKCFGLGIF 

CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 

KKENKNPVNNLIRKDYLEEPEKTQWNNVENLNRFERPMDYYEDL 

KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV | 


5861 


2051 


1305 



EVCACVQAFWLVASSGDDSQGGDKCGCEVGSWVGSMRVVMAm.L ' j 
S EGEQGI PTACAAFAQQPAG/ E PRRGLAGVGEGGPQCS WVNYRC 
rLEFLVSLLGTDLARGRGNS ASGPTAPADS KQL/ML * DVHRRVI 
LE*RMNSGSPARDNAPSQRFCTNLSEGLRFGISPSWREALYGCH | 

SEQ ID NO:
Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence
Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion,
\=possible nucleotide insertion)
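
The symbol legend above can be applied mechanically to any segment in this table. The following is a minimal sketch, not part of the original listing: the function name summarize_segment, the result keys, and the example string are hypothetical and chosen only for illustration. It counts standard residues and the annotation symbols defined in the legend, and reports any character the legend does not define (for example, scanning debris).

```python
from collections import Counter

# Symbols taken from the table legend; the constant names are illustrative.
STOP = "*"          # stop codon
DELETION = "/"      # possible nucleotide deletion
INSERTION = "\\"    # possible nucleotide insertion
UNKNOWN = "X"       # unknown residue
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 one-letter codes in the legend


def summarize_segment(segment: str) -> dict:
    """Count residues and legend annotations in one amino acid segment."""
    # Whitespace carries no meaning in the listing, so drop it first.
    cleaned = "".join(segment.split())
    counts = Counter(cleaned)
    recognized = RESIDUES | {STOP, DELETION, INSERTION, UNKNOWN}
    return {
        "residues": sum(n for ch, n in counts.items() if ch in RESIDUES),
        "unknown": counts[UNKNOWN],
        "stop_codons": counts[STOP],
        "possible_deletions": counts[DELETION],
        "possible_insertions": counts[INSERTION],
        # Characters the legend does not define, sorted for stable output.
        "unrecognized": sorted(ch for ch in counts if ch not in recognized),
    }


if __name__ == "__main__":
    # Made-up example string, not a sequence from the listing.
    print(summarize_segment("MKT*AL/GX\\RV"))
```

A non-empty "unrecognized" list is a quick signal that a segment contains characters outside the legend and should be checked against the published listing before use.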










5862 


1556 


483 


P P FQli I MG E I KVS PDYN W FRGT V PIjKK 1 1 VDDDDS KI W S L Y DAG 
PRS IRCPLI FLPPVSGTADVFFRQILALTGWGYRVTAIiQYPVYW 
DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
RVH S L I LCNS FSDTS I FNQ TWTANS FWLM PA FMLKKI VLGNFSS 
GPVDPMMADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPHKI 
RD I PVT IMD VF DQS ALS TEAKEEM YKL YPNARRAHLKTGGNF P Y 
LCRSAE VNLYVQI HL/ R / RNSME PNTR PLTHQWS VPRS LRCRKA 

ALASARRSSSVSLAVNDELTRCVLV*SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSLPIAAPREDTMGPLMVLFCLIiFLYPGLADSAPSCPQN 
VN I SGGT FTLS HG WAPGS LLT YS CPQG LY P S PAS RLCKSSGQ WQ 
TPGATRSLSKAVCKPVRCPAPVSFENGIYTPRLGSYPVGGNVSF 
ECEDGFI \LRGS P VRQCR PNGMW DG ETA VCDNG AG HCPN PG I SL 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 
PICRQPYSYDFPEDVAPALGTSFSHMLGATNPTQKTKESLGRKI 
QIQRSGHLNLYLLLDCSQSVSENDFLI FKESASLMVDRIFSF2I 
NVS VAI ITFAS EPKVLMS VLNDNS RDMTEVIS SLENAN YKDHSN 
GTGTNTYAALNSVYLMMNNQMRLLGMETMAWXQEIRHAIILLVT 
DGK\SHMGGSPKTAVDH IREI LNINQKRNDYLDI YAIGVGKLDV 
DWRELNSLGS KKDGERHAFI LQDTKALHQVFEHMLDVSKLTDTI 
CGVGNMS ANASDQERTPWHVT IKPKSQET\ C\ RGALISDQWVLT 
AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDVFA 
KKNQG I L\EF YGD \DI ALL \ KIiAQK\OCM\STHCQGPSCIiP\ CTM 
\EANLGFLRETFKGSTCR\DHENEL/\AWKQSV\PAHF\VAL\N 
GSKLEHLTLRMGVEWTSCCRGLSPKKKTM\FPNLT\DVRB\WT 
D\QFL\CS\GPQEDESP\CK*E\SGGA\VFLEKRFKLSAGGVWC 
SWGL\YNP\CT^SA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 

Q*SPWLRQHPGGMS*IFLPLLANGHLSPFACPARICRPLKFLPS 
EKATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPS I PPSS PLAC VLKNLKPLQLTPDLKPKCLI FFCNTAWPQ Y 
KLDNDSK*PENGTFEFSILQVLDNSCHKMGKWSEVPDVQAFF\S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPPHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CLPGPRWGEGWRAGHTI VGCI FFKTAI I SHFKGGMYLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLCTC\ICRCISMYTREHAC 
ACTRV*VYMCMS/VCTCVSTCIDVRVCAHVCVYMCLCLGYA*AC 
TCV*MCVCMHEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/CVY VCVLCVWACMRMSTC VWLVYG * ACTCVWMHM/ CSCTCR/ C 
VHVCCM SMHACB CLCVYLH I CGCAGTRRWWAGSARG S RS CS R L P 
C WAPG PG LS LPG PS CP S VEQG LGGG PGQ LQGRS GE ARLGEH RG W 
GSPAAVCSRNCTVS PRRGADCFEAPDVPKQPPGWGRAS FEERGC 
GGRGWVCAPPLKGPQCCCFSIKPELKAKKKK 


5866 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKKNKGKERRDLDDL 
KKEVAMTEHKMSVEEVCRKYNTDCVQGLTHSKAQEILARDGPNA 
LTPPPTTPEWVKFCRQLFGGFSILLWIGAILCFLAYGIQAGTED 
DPSGDNLYLGIVLAAWI ITGCFSYYQEAXSSKIMESFKNMVPQ 
QALV IREGEKMQVNAEBWVGDLVEI KGGDRVPADLRI I S AHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 
RGVWATGDRTVMGRIATLASGLBVGKTPIAIEIEHFIQLITGV 
AVFLG VS FF I LS L I LG YT WLE AV I FL IG 1 1 VANVP EGLLATVT V 
CLTLTAKRMAkKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRM 
TVAHMWFDNQIHEADTTEDQSGTS FDKSS HTWVALF* H/LLGFC 
NRPVFKGGQDNI PVLKRDVAGDASESALLKCI ELSSGS VKLMRE 
RNKKVAE I P FNSTN KYQLS IH ETEDPNDNR YLLVM KG AP ER I LD 
RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYL 
PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSM1GPPRAAVPDAVG 
KCRSAGIKVIMVTGDHPITAKAIAKGVGI I FEGNETVEDIAARL 
NIPVSOVNPRDAKACVIHGTDLKDPTSEQIDEltQNHTEIVFAR - 
TS PQQKLI X VEGCQRQGAI VAVTGDGVNDSPALKKADIGVAMGI 
AGSDVS KQAADMI LLDDNFAS FVTGVE EG R L I FDNLK KS IAYTL 
TSNIPEITPPLLPIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESDIMKRQPRNPRTDKLVNTERIjISMAYGQIGMIQALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQVITYEQRK 
WE FTCHTAF FVS I VWQWADL 1 1 CKT RRNS VFQQGMKN KI L I F 

GLF2ETALAAFLSYCPGMDVALRMYPLKPSWWFCAFP YS FLI FV 
YDE I RKL I LRRNPGG WVEKET Y Y 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQAIiDGSRMGKAKVPASKRAPSSPVAKP 
GPVKTLTRKKNKKKKRFWKSKAREVSKKPASGPGAWRPPKAPE 
DFSQNWKALQEWLLKQKSQAPEKPIjVISQMGSKKKPKIIQQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFCDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRALALDCEMVG VGPXGEESMAARVS I VNQYGKCVYDKYVKP 
TEPVTDYRTAVSGIRPENLKQGEELEWQKEVAEMLKGRILVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 
I LGLQVQQAEHCS IQDAQAAMRLYVMVKKEWESMARDRRP LI»TA 
PDHCS DDA * QS CP AAAAAPLQRQCDQSQGQI TS PQSGNSG ETFS 
ESWQRGVAWCY 


5868 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5869 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT


5871 


3 


3465 



fpfcrplrlyskttgdrsamagaagltaevsMkvlerrartkrs 

VLKLti*LSLRRL*LEPTI*NGLLT*CSRLSVFRPLKV\GSVYEP 
LKSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 

cggdqkakiqdslycaagawalalayrridddkgrthelehsai 

KCMRGILYCYMRQADKVQOFKQDPRPTTCLHSVFNVHTGDELLS 
YEE YGHLQ I NAVS LYLLYLVEMI SSGLQ 1 1 YNTDE VS FIQNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L* KQFNGFNLFGNQGCSWSVI FVDLDAHNRNRQTLCSLLPRESR 
SHNTDAALL PCIS Y PAFALDDE VL FSQTLDKWRKLKGKYGFKR 
FIjRDGYRTS LEDPNRCY YKPAE I KLFDG I ECEFP I FFLYMM IDG 
VFRGNP KQ VQ E YQDLLT P VLHHTTEG Y P W P KYY YVPAD FVE YE 
KNNPGSQKRFPSNCGRIXSKLFLWGQAIiYIIAKEiLADELISPKDI 
DPVQRYVPLKDQRNVSMRFSNQGPLENDLWHVALIAESQRLQV 
FLNTYGIQTQTPQQVEPIQIWPQQELVKAYLQLGINEKLGLSGR 
PDRPIGCLGTSKIYRILGKTVVCYPIIFDLSDFYMSQDVFLLID 
DIKNALQFIKQYWKMHGRPLFLVLIREDNIRGSRFNPILDMLAA 
LKKGIIGGVKVHVDRLQTLISGAWEQLDFLRISDTEELPEFKS 
FEBLEPPKHSKVKRQSSTPSAPELGQQPDVNISEWKDKPTHEIL 
3 KLNDCS CLAS Q AI LLG I LLKREG PNF I TKEGTVSDH I ERVYRR 
AGSQKLWSWRRAASLLSKWDSbAPSITNVLVQGKQVTLGAFG 
KEEEVISNPi^PRVIQNIIYYKCNTHDEREAVIQQELVIHIGWI 
3SNNPELFSGTLKIRIGWIIHAMEYELQIRGGDKPALDLYQLSP 
S E VKQLLLO I LQPQQNG RCWLNRRQ IDGS LNRTPTG FY DR VWQ I 
LERTPNG 1 1 VAGKHLPQQPTLSDMTMYEMNFS LLVEDTLGNI DQ 
PQYRQI WBLLMWS I VLERNPELEFQDKVDLDRLVKEAFNEFQ 
KDQSRhKEl EKQDDMTS FYNTPPLGKRGTCS YLTKAVMNLLLEG 
EVKPNNDDPCLIS 


5872


68 


665 


VOG YM YRFVI KINSC YSEKTS I CRHRCCPELPATQPWPTPTVFF 
NI AIDSES LGCI\SFKLFADKV/PKRWKKNFVl»LNTGEKVtiGDK 
GPCFYRIIPG\ LCQGG D FT7HHNG TGG KS L YS KE FDDEN FI / LKH 
TAPGVLSTANAGPTTNGSQFF I CTAKTEDG * QHWFGKVKDGMS 
IVEALERSGSRNGKTS KKI TAANCGQL 


5873


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSlrn/AGGFGNAASAR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
GECVGPNKCRCFPGYTGKTCSQDVNECGMKPRPCQHRCVNTHGS 
YKCFCLSGHMLM PDATCVNSRT CAM I NCQYS C E DTEEG PQCLCP 
SSGLRLAPNGRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAIPENSVKEVLRAPGTIKDRIKKliliAHKNSMK 
KKAKI KNVTPE PTRTPTP KVNL<3P FN YEEI VSRGGNSHGG \ KKG 
NEEKMKEGLEDEKREEKAbKD*HRRERPFRG\DVFFPKVNEAGE 
FGLIL\ VQRKALT5 KLEHKADLNI S VDCSFNHG \ I CDW \ KQDR\ 
EDDPDW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLKLLL 
P DLQPQS NFCLLFD YRLAGDKVG KLR VFVKNSNN ALAWEKTTS E 

DEKWKTG KIQLYQGTDATKS 1 1 FEAERGKGKTGEI AVDGVLLVS 
GLCPDSLLSVDD 


5874


2 


3387 


ACPRLARRRRRVRSLRRRRGWLRARWSRGQNNMAARRITQETFD 
AVLQEKAKRYHMDASGEAVSETLQFKAQDLLRAVPRSRAEMYDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
SYFRKECGRDLEFSHSNSRDQVIGHRKLGHFRSQDWKFALRGSW 
EQDFGHPVSQESSWSQEYSFGPSAVLGDFGSSRLIEKECLEKE\ 
SRDYD VDHSG \ E A \DS VLRGS \ SQ VQ A\ RGRALN I VDQEGS LLG 
. KGETQGLLTAKGGVGKLVTLRNVSTKKI PTVNRI TPKTQGTNQI 
QKNTPSPDVTLGTNPGTEDIQFPIQKIPIXjLDLKNLRIjPRRKMS 
FDI I DKS D VFSRFGI EI I KWAGFHTI KDDI KFSQLFQTLFELET 

etcakmlasfkcslkpe^rdfcfftikflkhsalktprvdnefl 

NMLLDKGAVKTKNCFFEIIKPFDKYIMRLQDRLLKSVTPLLMAC 
NAYELSVKMKTLSNPLDLALAL2TTNSLCRKSLALLGQTFSLAS 
SFRQEKIL*AVGLQDIAPSPAAFPNFEDSTLFGREYIDHLKAWL 
VSSGCPLQVKKAEPEPMREEEKMIPPTKPEIQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVIEGSLSPKERTLLKEDPAYWFLSDEN 
SLE YKY YKLKLA EMQRMS ENLRGADQKPTS ADCAVRAML YSRAV 
RNL KKKLL P \ WQRRGLLRAQG\ LRG\ WKARRA\TTGTQTLL FLR 
APGLKHHGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPSPKPAGVDISEAPQTSSPCPSADIDMKDNGRTAE 
KLARFVAQVG\PEIEQF\SI\ENSTDNPDLWFL\HDQNSS\AFK 
FY\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
G PS L EGST PADGLPGEA\ AEDDL/ ALGAPALFTGLLQ VTCFPFG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
KKKKPKDLDFAQQKb\TDK\NIiGFQ\ML0Kf4GWKEGHGLGSLGK 
G IR\SRSACTQQAAWGGSGWG LS PSTCS LPLGS FTAKMAYSWQL 
IFVF 


5875 


296 


1848 


LAALGGLP LWRLS RRG FR EY LLG LS APS ALGG AMRS VS Y VQRVA 
LEFSGSLFPHAICLGDVDNDTLNELWGDTSaKVSVYKNDDSRP 
WLTCS CQGMLTCVG VG DVCNKG KNLLVAVS AEGWFHL FD LT PAK 
VLDASGHHETLIGEEQRPVFKQH I PANTKVMLISDIDGDGCREL 
WGYTDRWRAFRWEELGEGPEHLTGQLVS LKKWMLEGQVDS LS 
VTLGPLGLPELMVSQPGCAYAILLCTWKKDTGSPPASEGPTDGS 
/S GDPS CPRRG AAPD I WP Y PQQECLHS PNWQHQT\SHGTES SGS 


5876 






GL FALCTLI3GTLKLM e em eeadkl l ws vq vd hql fal e kld vtg ~ 

NGHEEWACAWDGQTY 1 1 DHNRTWRFQVDEN IRAFCAGLYACK 
KGRNSPCLVYVTFNQKIYVYWEVQLERMESTNLVKLLETKP\ST 
TACCRSWAWILTOL*LVPCFTKRSTIQTSHHSVLPQASRIPPS 
JCTCLIAGEGFF+TPTLPPKGVFGSHCAAAGSITKQ 


5877


1122 


224 


HbPLGVPSKVAGAAAMEPQEERETQVAAWLKKIFGDHPI^QYEV 
KPRTTEILHHLSERNRVRDRDVYLVIEDLKQKASEYESEAKYLQ 
DLLMESVNFSPANLSSTGSRYLNALVDSAVALETKDTSLASFIP 
AVNDLTSDLFRTKSKSEE1KIELEKLEKNLTATLVLEKCLQEDV 
KKAELH LSTER \ AKVDNRRQNM \ D FLKAKS BE FR FG I Q AAGEQL 

SARGQ\DAFSVPIQSLVALIRENWPRLKQQTIPLK\KKLESYLD 
LMP\NPSHCSK*RIEEAK\RELA\5IEAELTRRVS\MMEL 


5878


2030 


1907 


CaTJjGKMAASSSGEKEKERLGGGLGVAGGNSTRERLLSALEDLEV 
LSRELIEMLAISRNQKLLQAGEENQVLELLIHRDGEFQELMKLA 
LNQGKIHHEMQVLEKEVEKRDSDIQQLQKQLKEAEQILATAVYQ 
AKEKLKSIBKARKGAISSEEIIKYAHRISASNAVCAPLTWVPGD 
PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDALA/RRKIAR 
CPCSTVS/NGSQMTCR+INIILILQKSVCEIi 


5879 


950 


2113 


GLWKCMQLQUFHTHRVQP*PTPRQQGPQ\VPVAVIAGNRPNYLY 
RMLRSLLSAQGVSPQMJTVFIDGYYEEPMDWALFGLRGIQHTP 
ISIKNARVSQHYKASLTATFNLFPEAKFAWLEEDLDIAVDFFS 
FLSQS IHIiLEEDDSLYCI SAWNDQG YEHTAEDPALLYRVETMPG 
LGWVLRRSLYKEELEPKWPTPEKLWDWDMWMRMPEQRRGRECII 
PDVSRSYHFGIVGLNMNGYFHEAYFKKHKFNTVPGVQLRNVDSL 
KKEAY E VEVHRLLS EAE VLDHS KNP CEDS FL PDTEGHTYVAF I R 
MEKDDD FTTWTQLAKCLH I WDLD VRGNHRGLV7RLFRKKNHFLW 
GVPASPYS VKKPPSVTPI FLEPPPKEEGAPGAPEQT 




3 


981 


RLTEAAAAGSUSRAAGWAGSPPTLLPLSPTSPRCAATMASSDED 
GTNGGASEAGEDREAPGKRRRLGFLATAWLTFYDIAMTAGWLVL 
AIAMVRFYMEKGTHRGIiYKSIQKTLKFFQTFALIiSIVHCLIGIV 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEESWLFLVAWTVT 
EITR YS FYTFSLLDHLPYFI KWAR YNFFI I LYPVGVAGELLT I Y 

AALPHVKKTGMFSIRLPNKYNVSFDYYYFLLITMASYIPLFPQL 

yfhmlrqrrkvlhg\g+l*krmik*slqtrcffqnnqdylspsf 

NNKNKQLCEISWIVWFLKI 


5880 
5881


1138


1324 


slwclvaggi^j^gpssqnplqragilarpreargtfsaltacsa - 
svtskgksssgmwpsaasdrdspvplrppgpvqlpsgtgwvlsd 

* K^RGRCSS / WljSQPQHEREKEVVLLRRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERSP 
DQSEHTIX3HTSVQSVIEKLQEENRLLKQKVTHVEDLNAKWQRYN 
ASRDEYVRGLHAQLRGLQ1PHEPELMRKEISRLNRQLEEKINDC 
AEVKQE LAASRTARDAALE R VQMLEQQ I LAYKDDFMS ERADR ER 
AQSRIQELEEKVASLLHQVSWRQDSREPDAGRIHAGSKTAKYLA 
ADALELMVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEEIiLRJiVAECCQ 


5882 


26 


441 


UUIHPSPTEAPRAQHLTMDCTWRlLFLVAAATGTHAQVQLLQSG 

SEVKKPGASVMVSCYVSGYTLTKI*SMPn'?VRQAPGKGLE*MGPFD 

LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 




2407 


2216 



SGCVEMLYSHSLEYNPEWISVQSAVAPAQLAI^SDGDL*LHSGE 
RTRRD*QLPEAGGPGLQEPLOLGELDITSDEFTLntrvnp\ \mr d 
HYSKQVELELQQ I EQKS IRDY IQESENIASLHNQ I TACDAVLER 
MEQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 
ELVDGLWPSALVTAILEAPVTEPRFLEQLQELDAKAAAVREQE 
\RGTAACADVRGVLDRIiRVKAVTKIREFILQKIYSFRKPMTNYQ 
IPQTALLKYRFFYQFLLrGNERATAKElRDEYVETLSKI YLSYYR 
5YLGRLMKVQYEEVAEKDDLMGVEDTAPCKGFFSKPSLRSRNTIF 
rLGTRGSVISPTELEAPILVPHTAQRGEQRYPFEALFRSQHYAL 
jDNSCREYIiFICEFFWSGPAAHDLFHAVMGRTLSMTLKHLDSY 
jADC YDA I A VFLCI H I VLR FRNI AAKR DVPALDR YWEQ VLALL W 


5883 






PRFELILEM^VQSVRSTDPQRmGtDTRPHYlTRkyAEPSSALV 
SINQTIPNERTMQLLGQLQVEVEKFVLRVAABFSSRKEQLVPLI 
NNYDMMLGVLM\E*ERAADDSKEVESFQQLLNARTQEFIEEI,LS 
PPFGGLVAFVKEAEALIERGQAERLRGEEARVTQiilRGFGSSWK 
SS^SLSQDVMRSFTNFRNGTSIIOGAiT^LIOXLYI-mFHRVXl, 
S QPQ LRAL PARAELINIHHLM VELKKHKPNF 


5884 


2 


1374 


E FPGRR h'RAVM EAGAG AG AGAAG WSC PG PG PT V TTLGS YEAS EG 
CERKKGQRWGSLERRGMQAMEGEVLLPALYEEEEEEEEEEEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEL 
EETRELAGQHEDDSLELQGLL2DERLASAQQAEVFTXQIQQL0G 
ELRSLREEISLLEHEKESBLKEIEQELHLAQAEIQSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
MKSSEPSGSLGLSDYSGLQEELQELRERYHFLNEEYRALQESJ7S 
SLTGOIADLSSERTQRATERWIiQSQTLSMTSAESQTSEMDFLEP 
DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEELQHHRQVSEE 
EQRRLQRELKCAQNE VLRFQTSHS \SPSHPLPPI PPSS PCLL * A 
LWISALLWCWWAETSS 


5885 


4261 


2522 


G VIiARAS/u< JjK. v PLTG VRACAE PE VGAE PAKVAGAAE P DEDGGR 
SRLRDCGDYTPSERIjGPKGAMIjWFQGAIPAAIATAKRSGAVFVV 
FVAG DDEQS TQMAAS WEDDKVTEAS SNS FVA I K I DTKS E ACLQ F 
SQIYPWCVPSSFFIGDSGIPLEVIAGSVSAnPT.VTOTuirr/T3rtM 
HLLKSETSVANGSQS ESS VST PSAS FEPNNTCENSQSRNAELCE 
IPSTSDTKSDTATGGESAGKATSSQEPSGCSDQRPAEDLNIRVE 
RLTKKr.EERREEKKKEEEQREIKKEIERRKTGKEMLDYKRKQEE 
ELTKRMLEERNREKAEDRAARERIKQQIALDRAERAAPPa?rrvi? 

EVEAAKAAALLAKQAEMEVKRESYARERSTVARIQFRLPDGSSF 
TNQ F PSDA PL E E ARQ FAAQTVGNT YGNFS LATM FPR R EFTKED Y 

kkklldlelapsaswli*p/alfinf*agrptasivhsssgdiw 

TLLGTVLYPFLAIWRLISNFLFSNPPPTOTSVRVTSSEPPNPAS 
SSKSEKREPWKRVLEKRGDDFKKEGKIYRiRTQDDGEDE^ 


5886


900 


467 


AAUBGRRSRLSRSWPTGP3KSPSGVRCCG\RR\AWEDKDEFLDV 
IYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN 
YLQ I DEEEYGGTWELTKEGFMTS FA/ I VHGHLDHLLHCH PL * LM 
VYSSQVLPI QS KGPS 


5887


86 


1341 


P^RGRALTLKKCjPRPGVAPPSLGTCHKSDPGftPAAQSOPPSP^S 

GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG 

EVLLEALFLTVDPYMRVAAKRLKKGDTM^QQVAKVVESKNVAL 

PKGTIVLASPGWTTHSISDGKDLEKLLTEWPDTIPLSLALGTVG 

MPGLTAYFGLLEICGVKGGETVMVNAAAGAVGSWGQIAKLKGC 

KWGAVGSDEKVAYLQKLGFDWFNYKTVESLEETLKKASPDGY 

DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 

PPEIGIYOELRMEAFWYRWQGDARQKALKDLLKWVLELPYFVI 

D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMLKGDNLGKTIVKA 




1937


104 



APGCRGCRATRCPCRGPRWUSLGDEAARSPAAPGGAPGLLGLRE^ 
RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGL^GLQGPP 
PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 
ACSVP WTGD5QFCSQKAVI YSLNFTANPPQRVFELVDQINPS I 

FCIHITN\*NLHYPLLIQKYL/NENNFDTLMKTSDGFTLNAESY 

VSFTTKLDIPTAAKYFYf:VDr.nTcnei?T ocnnor mrm* 

* ■* *«vu4/iir A"rtivic»i\jvfijuisuSrLRFPSSLTSSLCrDNNP 

^VAFLVNQAVKCTRKINLEQCEE I EALSMAF YSS PEI LRVPDSRK 
iCVPITVQSIVIQSLNKTLTRREDTDVLQPTLVNAGHFSLCVNW 
jEVKYSLTYTDAGEVTKADLSFVLGTVSSWVPLQQKFEIHFLQ 
2NTQPVPLSGNPGYWGLPLAAGFQPHKGSGII0TTNRYGQLTI 
.HSTTEQDCLALEGVRTPVLFGYTMQSGCKLRLTGALPCXJLV^Q 
CVKS LLWGCG F PD YVAP FGNS QGP/ ADMLD W V P IH F I TQS FNRK 
>S CQ LPGALVI E VKWTKYGS LLNPQ AKI VNVTANL I S S S F PE AN 
GNERTILISTAVTFVDVSAPAEAGFRAPPAINARLPFNFFFPF 






WO 01/53312 



PCT/US00/34263





5888 


375 


2302 


DLCRTPGVAMQRADSEQPSkRPRtDDSPRTPSNTPSAEADWSPG" 
LELHPDYKTWGPEQVCSFLRRGGFEEPVLLKIJIRENEITGALLP 
CLDESRFENLGVSSLGERXKLLSYIQRLVQIHVDTMKVINDPIH 
GH I ELHPLLVR 1 1 DTPQ FQRLR YI KQLGGG Y YVFPG ASHNRFEH 

SLGVGYliAGCbVHALGEKQPELQISSRDVLCVpiAGLCHDLGHG 
PFSHMFDGRFIPLARPEVKWTHECX5SVMMFFHT twcmptvd\/wd 

QYGLIPEEDICFIKEQIVGPLESPVEDSLWPYKGRPENKSFLYE 
IVSNKRNG1DVDKWDYFARDCHHLGIQNNFDYKRFIKFARVCEV 
DNELRICARDKEVGJ^YDMFHTRNSLHRJRAYQHKVGNIIDTMIT 
DAFLKADDYIEITGAGGICKYRISTAIDDMEAYTKLTDNIFLEIL 
YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES 
LPKEVASAKPKVLLDVKLKAEDF IVDVINMDYGMQEKNP I DHVS 
FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\LIRVYCKKVDRKS 
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKIPTRLPRRLPKSRV\QLFXDDPM 


5889 


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRDLDPYPPQVFPPRPDRVAI 
VTGGTDGIGYSTAKHLARLGMHVI I AGNNDSKAKQWSKIKEET 
LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
FGlFIb\DLASMTSIRQFVQKFKMKKI?IiHVLINNAGVMMVPQR 
KTRDGFEEHFGLNYLGHFLLTNLLLDTIiKESGSPGHSARWTVS 
SATHYVAELNMDDLOSSACYSPHAAYAQSKLALVLFTYHL0RLL 
A7\EGSHVTA14WDPG\A/NTDLYKHVFWATRLAKKLLGWLLFKTP 
DEGAWTS IYAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLQOQ 
LWS KS C EMTGVLDVTL 


5890 



1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAANAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH
LVRQIP\NTAIMMATYELWYLLNG


5892
5893


1764 


379 


vvlrvcgrlsvnsavssrtggwsagltcamqrlqvvlghlrgpa" 

DS GWM PQ AAP CLS GAPHASAADWWHGRRTAT n? flr or r» ttw tvp 
TPDELLSAVMTAVLKDVNLRPEQLGDICVGNVLQPGAGAIMARI 

aqflsdipetvplstvnrqcssglqavasiaggirngsydigma 
cgvesmsladrgnpgnitsrlmekekardclipmgitsenvaer 
fgisrekqdtfalasqqkaaraqskgcfqaexvpvtttvhddkg 
tkrsitvtqdegirpsttmeglaklkpafickdgsttagnssqvs 
dgaaaillarrskaeelglpilgvlrsyawgvppdimgigpay 
ai pvalqkagltvsdvdi feine\afasqaaycveklrlpp +eg 

*tplggasgp*ghplglhwghvqvitlaq*s*sargkrayrsgc 
pcaigswngs plpvfeypwgt 




3 


1653 



ILSKRRCQKAKTKELMAKKVAVIGAGVSGLISLKCCVDEGLEPT 
CFERTEDIGGVWRFKENVEDGRASIYQSWTNTSKEMSCFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGOWKWTQSNGKEQSAVFDAVMVCSGHHILPHIPLKSFP 
3MERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
VAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 
rAVKMMIEQQMNRWFNHENYGLEPQNKYIMKEPVLNDDVPSRLL 
KVKSTVKELTETSAI FEDGTVEENIDVI I FATGYS FS FPF 
LEDSbVKVENNMVSLYKYIFPAHLDKSTLACIGLIQPLGSIPPT 
AELQARWVTRVFKGLCSLPSERTMMMDI IKRNEKRIDLFGESQS 
QTLQTN YVD YLDELALE I GAKPD FCS LL FKD PKLAVRL YFG P CN 

SY*YRLVGPGQWEGARNAIFTQKQRILKPLKTRALKDSSNFSVS 
FLLKILGLLAVWAFF\CQLQW$ 


5894 
5895


174 


1673 


RYSPKKVLQNKESSLKLGMATALVSAHSIiAPLNLKKEGLRVVRE* 
DHYSTWEQGFKLQGNSKGUjQEPLCKQFROtjRYEETTGP REALS 
RLRELCQQWLQPETHTKEHILELLVLEQFLIILPKEIiQARVQEH 
HPESREDWWLEDLQLDLGETGQQVDPDQPKKQKILVEEMAPL 
KGVQEQQVRHECEVTKPBKEKGEETRIENGKLIWTDSCGRVBS 
SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHLD 
LIEHASTHTGKKLCESDVCQSSSLTGHKKVLS*ERKVIQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHLR 

ihtgekpylcxhcgknfrrsshlnrhqrihsqeepcbckecgkt 

FS Q ALLLTHHQR I H S H S KS HQCNECG KAFS LTS DL I RHHR I HTG 

EKPFKCNICQKAFRLNSHLAQHVRIHNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 




2967 


. 66 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5896 


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5897 


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5898 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5899 


326 


1078 


NCPKSKEPNG\n^PSLPSPU?AAMALSDVDVKKQIKHMMAFIEQ " 

EANEKAEBIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 

QQKKILMSTMRNQARLKVLRARNDLISDLLSEAKLRLSRIVEDP 

EVYQGLLDKLVLQGLLRLLEPVMIVRCRP\QDLLLVEAAVQKAI 

PEYMTISQKHVEV\QIDKEA*LAVECSWEW?EVYSGNQRIKVSN 

TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 


5900 


64 


1469 


KAASRDSPCLEFCPLCGVSSHDI^HRMWYHRLSHLHSRLQDLLK 
GGV I Y P ALPQ PNFKS LL PLAVH WHHTAS KSLTCAWQQHEDHFEL 
KYANTVMRFDYVWLRDHCRSASCYNSKTHQRSLDTASVDLCIKP 
KTIRIiDETTLFFTWPDGHVTKYDLNWLVKNSYEGQKQKVIQPRI 
LWNAE I YQ0AQVPS VDCQS FLETNEGLKKFLQNFLLYGIAFVEN 
VPPTQEHTEKLAERISIilRETIYGRMWYFTSDFSRGDTAYTKLA 
LDRHTDTTYFQEPCGIQVFHCLKHEGTGGRTLLVDGFYAAEQVL 


SEQ ID NO: 


Predicted beginning nucleotide location corresponding to first amino acid 
residue of amino acid sequence 


Predicted end nucleotide location corresponding to first amino acid 
residue of amino acid sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, 
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








QKAPBEFELLSKSA1\KHEYIEDVGECHQPHDWDWAQS*ISTHG 
/ YKELY LI RYNNYDRAV INTVPYDWHRW YTAHRTLTI ELRRPE 
NE FWVKL KPG RVL F IDNWR VLHGRECFTG YRQLCG CYLTRDDVL 
NTARLLGLQA 


5901 




2121 


VAIEQTSLXMMQAVGGAPAR PTGEY ICNQCGAKYTS LDSFQTHL 
KTHLDTVLPKLTCPQCNKEFPNQESLLKHVTIHFMITSTYYICE 
SCDKQFTSVDDLQKHliLDMHTFVFFRCTliCQEVFDSJCVSIQLHL 
\ AVKHS N E KKVYR CTS CNWDFRNETDLQLHVKHNHLEN^JG KVH K 
CI FCGESFGTEVELQCHITTHSKKYNCKFCSKAFHAI ILLEKHL 
REKHCVFETKTPNCGTNGASEQVQKEEVELQTLLTNSQESHNSH 
DGS EED VDTS E PM YG CD ICX5AAYTMETLLQNHQLRDH N I RPG E S 
AIVKKKAELIKGNYKCNVCSRTFFSENGIjREr-IMQTHLGPVKHYM 
CPICGERFPSUiTLTEHKVTHSKSLDTGNCRICKMPLQSEEEFIi 
EHCQMKPDLRNSLTGFrlCWCMQTVTSTljELKIHGTFHMQKTGN 
GSAVQTTGRGQHVQKLYKCASCLKEFRSKQDLVKLDINGLPYGL 
CAGCWLSKSASPGINVPPGTNRPGLGQNENLSA I EGKGKVGGL 
KTRCS*LATFKF*VLKVELPEPHPKPFHRGVSRPDSNSTQLKTP 
QVSPMPRISPSQSDEKKTYQCIKCQMVFYNEWDIQVHVANHKID 
EGLNHECKIjCSQTFDSPAKLQCHLIEHSFEGMGGTFKCPVCFTV 
FVQANKLQQHIFSAHGQEDKIYDCTQCPQKFFFQTELQNHTMTQ 
HSS 


5902 


712 


209 


LKNRRRSRPS I RQS IGSTSVSRWLTSLFTYLDHTADVQ * V* REF 
I PLXPRQ* ED * MFQSWLHAWGDTLEEAFEQCAMAMFG YMTDTGT 
VEPLQTVKVETQGDDLQSLLFHFLDEWLYKFSADEFFIP\GWGE 
EFS LSKKPQGTEVKA ITYSAMQVYNEEN PEVFVI ID I 


5903 


210 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT 
PAL FALS AVPGGAAS PMPPSGLRLLP LLLPLLWLLVLT PGRPAA 
GLS TCKTI DMELVKRKR I EAI RGQ I LSKLRLAS PPSQGE VPPG P 
LPEAVLALYNSTRDRVAGESAEPEPEPEADYYAKEVTRVLMVET 
HNE I YDKFKQS THS I YM FFNTS ELRE AVPE P VLLS RAELR LLRL 

kl kveqh vel yqk ys nns wr ylsnrl lap s ds pe wls fd vtg vv 
rqwlsrggeiegfrlsahcscdsrdnt^qvdingfttgrVrgdl 
at 1 hgmnrp fl llmat pleraqhlq s \ s rhrqal\dtn y\ c fs f 
hggrnclrc/vhc*hlifrkdl\gw\kwi\he\pkgyhanfc\l 

G PCP Y 1 WSLDTQYS KVLAL YNQ \HK PG \ AS AAP \ CCVP QALE P \ 
LPIVYY\VGRKPKVEQIjSNMIVRSCKCS 


5904 


3 


1126 


MMEEIENAINTFKEEQRLIYEELIKEEKTTNNELSAISRKIDTW 
ALGNSETEKAFRAISSKVPVDKVTPSTLPEEVLDFEKFLQQTGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKESIQIWKTKKQQKREEIFKLKEKADNTP 
VLFHN KQEDNQKQK E EQRKKQKLAVE AWKKQKS I EMSMKCASQL 
KEEEEKEKKHQKERQRQFKLKLLLESYTQQKKEQEEFLRLEKEI 
RBKAEKAEKRKNAADEISRFQERDLHKLELKILDRQAKEDEKSQ 
KQRRLAKLKEKVENNVSRDPSRLY/NTHQRLGRTNQKDRTNRLW 
ATSTYPT+GYSNLETRNTEKSMR 


5905 


287 


2912 


MASFPPRVNEKEIVRLRTtGELLAPAAPFDKKCGRENWTVAFAP 
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKKKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGRI KI WDVYTGKLLLNLVDHTG WRDL 
TFAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHQNWVY\ 
SCAFS PDSSMLCSVGASKAWAAILV*LRLCWHHSHT3ATMVLS 
W AER VAS LATGLG ATFTI G * SNLAFVLQG VLY VHR CWSMST FCF 
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSETNLM*SI 
WLSNGFS VLFFGI LSDSRDILRL* FNLKFVLI FF * K* CIVSVQK 
KKKPKR I ALLQEERLS *DKPPSSHLI *QTEVNIRI LFRAI LHS * 
LLI FR I * NCI * T YS * I IDP F YIQMT YDRG* FGKNKMVKF+ FIEM 
* LYYFHKI AFSFCNW*HPCCLPKKFHLAVNILFACS ICFSS * A 
QVGDP SLL*TSDYLKGRCQWSNNLLTLR FLSVY FFKNLWSGKK 
REGGL*YLTLFISVYFS*LVFGINGFUYSFWKLHCLYFMFRLI 
FKLTFNRNI*NRICMSALINLKTDFNLTMTLSIFFKLHIYNA* 
YNLN * I + QF + YKMCHFVLCMS E * S YNI CLFI AGF\ LWNMDKYTM 
^RKLEGHHHDWACDFSPDGALLATASYDTRVYIWDPHNGDILM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVR 
FWR I DED Y P VQVAPLSNGLCCAFSTDGS VLAAGTHDGS V YFWAT 
PRQVPSLQHLCRMS I RRVMPTQE VQELP I PS KLLE FLS YR I 


5906 


146 


2038 


REGAGSGRMASGA\ YN P Y I E 1 1 EQ PRQRGMR FR Y KCEGRS AG S t 
PGEHSTDNNRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH 
DLVGKDCRD\GYYEAEFGQE\RRP\LFFQN\LGIRCVKKKEVKE 
A\ I ITR\ I KAG INPFDVP*KQLNDI EDCDLDWRLWFRVFLPDG 
HGNL\ TTALPPV\ VSS P I YDNRA PNTAELR VCR VNKNCGS VRGG 
DEIFLLCDKVQKDDIEVRFVLNDWEAKG1FSQADVHRQVAIVFK 
TPPYCKAITEPVTVKMQLRRPSDQBVSESMDFRYLPDEKDTYGN 
KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QSAGITVNFPERPRPGLLGSIGEGRYPKKEPNLFSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLSSFSTRTLPSNSQG I PPFLRIPVGNDLNASNACI YNN 
ADDIVGMEASSMPSADLYGISDPNMLSNCSVNMMTTSSDSMGET 
DNPR LIiSMNLENPSCNS VLDPRDLRQLHQMS S S S MSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQUSQY 
SGIGSMQNEQLSDSFPYEFFQV 


5907 


99 


1873 


TYLLSSWSS»«WLDTKIKSQVKV/RiteHKKlSWPy^PAKQWGK 
KATSKVPSAPHFVHPNDHANREAELKXKWVEEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQELNMFPQLDDEATRK 
AYYKEFRKWEYSDVILEVLDARDPLGCRCPQMEEAVLRAQGNK 
KLVLVL^IDLVPKEWEKWLDYLRNELPTVAFKASTQHQVKNL 
NRCSVPVDQASESLLKS KACFGAENLMRVLGNYCRLGE VRTH IR 
VGWGLPNVGKSSLINSLKRSRACSVGAVPG1TKFMQEVYLDKF 
IRIjLDAPGIVPGPNSEVGTILRNCVHVQKLADPVTPVETILQRC 
NLEE I SNYYG VS G FQTTE H FLTAVAHRJLGKKKKGGL YS QEQAAK 
AVIiADWVSGKISFYIPPPATHTLPTHLSAElVKEMTEVFDIEDT 
EQANEDTMECLATG ESDE LLGDTD P LEME I KLLHS PMTKIADAI 
ENKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 
S^DRRSVLQRIMETDPLQQGQAIjASALKNKKKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCGIKKRGEGSGSPSPASGGFQDGCQIP3PSLPSEEETHPHTRA 
HTRTLRATLTORPPRSHSTRIjRFPMPLDGDGGLASWK/PMRER* 
GWRRPAKAAGASLGVAATGKRGCRMSKR YLQKATKGKLLI 1 1 FI 
VTLWGKWSSANHHKAHHVKTGTCE WALHRCCNKNKI EERSQT 
VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 
VLPDRKGWSCSSGNKVKTTRVTH 


5909 


1 



5002 


PAI PGSTI I WAPGSHSAARADGRHGSIjPSQSQAPGAI/CGARAPP 
SSNLRADRSMICAQARAGKNLYHNRFLGLAAMAFPSRNSQSLRR 
CKEPI R YS YNPDQFHNMDLRGG PHDG VTI PRSTSDTDLVTS DS R 
STLMGRSSYYSIGHSQDLVIHWDIKEEVDAGDWIGMYLIDEVLS 
ENFLDYKNRGVNGSHRGQIIWKIDASSYFVEPETKICFKYYHGV 
SGALRATTPS VTVKNSAAPI FKS IGADETVQGQGSRRLI S FSLS 
DFQAMGLKKGMFFNPDPYLKISIQPGKHSIFPAIiPHHGQERRSK 
I IGNTVNP I WQAEQFS FVSLPTDVLE I EVKDKFAKS RPIIKRFL 
GKLSMPVQRLLERHAIGDRWSYTLGRRLPTDHVSGQliQFRFEI 
TSSIHPDDEEISLSTEPESAQIQDSPMNWLMESGSGEPRSEAPE 
SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGEEASALLLE 
iA^c.MJr/\a i Wiftf iictih>Ai iwoKA^KbEBEKEQEEEGDVSTLiEOG 
EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPETPRTHYIR 
IHTLLHSMPSAQGGSAAEEEDGAEBESTLKDSSEKDGLSEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 
SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 
I DE P LPPNWE AR I DSHGRVF YVDHVNRTTTWQR PTAAATPDGMR 
RSGSIQQMEQLNRRYQNIQRT1ATERSEEDSGSQSCEQAPAGGG 


5910 






Wi^usUiiEABSSQysi.DLKREGSLSPVNSUKiTLLLQSPAVKFI 
TNPEFFTVLHANYSAyRVFTSSTCLKHMILKVRRDARNFERYQH 
NRDLVNFINMFADTRLELPRGWEIKTDQQGKSFFVDHNSRATTF 
IDPRIPLQNGRLPNHLTHRQHLQRLRSYSAGEASEVSRNRGASL 
LARPGHSLVAAIRSQHQHESLPLAYNDKIVAFLRQPNIFEMLQE 
RQPSLARNHTLREKIHYIRTEGNHGLEKLSCDADLVILLSLFBE 
E I MS YVPLQAAFH PG YS FS P RCS P CSS PQNS PGLQ RASARAP S P 

YRRDFEAKLRNFYRKLEAKGFGQGPGKIKLI IRRDHLLEGTFNQ 
VMAYSRKELQRNKLYVTFVGEEGLDYSGPSRBFFFLLSQEIjFNP 
YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILGNIiALI 
HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYLDEEFHQSLQW 

-mkdnnitdildltftvneevfgqvterelksggantqvteknkk 
eyiermvkwrvergwcqtealvrgfyswdsrlvsvfdarele 
lviagtaeidlndwrnnteyrggyhdghlvirwfwaaverfnne 
qrlrllqfvtgtssvpyegfaappwbpmglrrflp*kkwgkits 
lpprg\htclqpdwdlptvsprtpmlyek\llta\veetstfgt 


5911 


1526 


446 


vaefaamepgrtqikldprytadllbvlktnygipsacfsqp^t 

AAQLLRALGP VE LALTS I LTLLALGS I AI FLEDAVYLYKNTLCP 
I KRRTLLWKSSAPTWSVLCCFGLWI PRSLVLVEMTITSFYAVr 
FYLLMLVMVEGFGGKEAVLRTLRDTPMfdVHTCPCCCCCPCCPRL 
LLTRK KLQ \ R * CWALSNTPS * R * R* P WWACFSS PTASMTQQTFL 
RGAQLYGSTLSSA/CSTLLALWTLGIISRQARLHLGEQNMGAKF 
ALFQVLL I LTALQPS I FS VLANGGQ I ACS PPYS S KTR3QVMNCH 

LLILETFLMTVLTRf4YYRRKDIIKVGYETFSSPDLDLNLKALRWM 
AWTMKGCCTH V«««-HWau*llM 


5912 


109 


595 


QoPLAPCIQGKGIiEMRSPKPOSFIIRSSHSGAGLLVKNPSTPVF 
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS 
GTIS AHCNLRLPS SSNS PAPAS * LAG ITG VCHHAQL I FVFLVET 
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 




924 


277 


MILNKALMLUALALTTVMSPCGGEDIVADHVASYGVNLYQSYGP 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTN 
IAVLKHNLNIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLX 
CL VDN I FPP WNI TWLSNGHS VTEG VSETR PS S PKS DH Fr iT »QDQ 
VTS PS FP FE * * DL * TAKV EQLGAW FE PLLKHWGAE I PTTL 


5913 
5914 


46 


1198 


OLRMAGAEGAAGKQSELEPWSLVDVLEEDEELBNEACAVLQOS 
DS E KCS YS QG S VKRQAL YACS TCTP EGEE PAG I CLACS YECHGS 
HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFQEKVCQACMKRCSFLWAYAAQLAVTKISTXGMMDWCGTLM 
E * /DDQEVI KPENGBHQDSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGSSSESDLQTVFKNESLNAESKSGCKLQELKAKQLIKKDTAT 
YW PLNWRS KLCTCQDCMKMYGDLD VLFLTDE YDTVLA YEN KG KI 
AQATDRSDPLMDTLSSMNRVQQVEL 1 C/G IQ * FED 


5915 


960 


124 


MLGGSELPPEEALFIQVASMNQRRVDFYLASIEDMIjVAl/GGRN 
ENGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 
GHDYQIGPYRKNtiLCYDHRTDWJEERRPMTTARGWHSMCS^GDS 
I YS IGGSDDN I ES ME R FD VLG VE AYS PQCNQ WTR VAP LLHANSE 
SG VAVWEGR I YI LGG YS WENTAFSKTVQV YDREADKWSRGVDLP 
KAI AGGSACFI AP* SLGQRTRKRKAKARGTRTGASDPSCASWDH 
PHRHLPGLCRPAATS 


5916 


1604 


703 


FPGRPTRPIiKLGRRPKRAPT TnaDwr , trppDnt>^/-»r>r>i-iVT ~ „ — 
* i afmiw wmuuuwR i Ayrtyji^u^jr'KFKTCPPGAIjQAPFA 

PAS RAEG PVAVWNGHTEG P APARS AP KE P PGL P R PLGS FPCPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGL+ *AAGPAAH 


5917 


256 
1343 


633 


827 


5PRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSTKGTSGKGCR 
rVTGAVHRHLNHVAGI IPWVLHSQLKPTAATAQDQWTSQQYPDH 
5 TRLI LQ*NQATADKNN* TTALLQPHQRL \ VS PRMAEA 

iHQILTYLEP/ICLWNYNKIIiTVFLTKSVLBl*KFIHTPQTYR 


5918 


1247 


? * NDFFGI KK V YVS RRLRKTS F/ RLAVTFLE QA WS KECVP VPQ 
7MEHLL PS LLSLAS D P VPNVR VLLAKALRQMLLB KA Y FRNAGN P 
HLEVIEETILALQSDRDQDVSPFAALEPKRRNIIDT AVLBKQN 
EGAQVARRR3RRQWRAGRCGRGRGGRRAERTX3GRGPPGRPRPXjP 
PGPARRGRRRMETP F YGDEALSGLGGGASGSGGTFAS PGRLFPG 
A?PTAAAGSMMKKDALTIjSLSEQVAAALKPAPAPASYPPA\ADG 
APS AAP PDGLLASPDLGLLKLAS PELERL 1 1 QSNGL VTTTPTS S 
QFLYPKVAASEEQEFAEGFVKALEDLKKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPPGELAPAAAAPEAPVYA\NLSSY\AGGCRGli 
RGGAAT\VAFAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PASRKLGAOSRALERESEDPS*SPEHGSLASTASLLREQVAQLX 
QKVLSHVNSGCQLIiPQHQVPAY 



4254 



5920 



1381 



TS VQGDSQGTPTS S QG S INM E H W I S OA I HGS TTSTTS S S STQS G 
GSGAAHRIiADVMAQTHI ENHSAPPDVTTYTSEHSIQVERPQGST 
| GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPKPEGAQMLAMRGEQLGWTNW 
PPS LE AALQR WGTI S P KAPCLTTMDTNGKPL Y I LTYG KLWTR S M 
KVAYSILHKLGTKQEPMVRPGDRVALVFPNNDPAAFMAAFYGCL 
LAEWPVPIEVPLTRKDAGSQQIGFLLGSCGVTVALTSDACHKG 
LPKSPK3BI PQFKGWPKLLWFVTES KHLSKP PRDWF\ PHIKDAN 
NDT AY I E YKTCK\ DGS VLGVT VTRTALLTHCQALTQACG YTEAE 
TIVNVLDFKKDVGLWHGILTSVMNMMHVISIPYSLMKVNPLSWI 
QKVCQYKAKVACVKSRDMHWALVAHRDQRDINLSSLRMLIVADG 
ANPWSISSCDAFLNVFQSKGLRQEVICPCASSPEALTVAIRRPT 
DDSNQ PPGRG VLSMHGLTYGVI RVDS EEKLS VLTVQDVGLVMPG 
AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 
FEVFAMTSSGAP I SE YPFIRTGLLG FVGPGGLVF WGKMDGLMV 
VSGRRHNADDIVATALAVEPMKFVYRGRIAVFSVTVLHDERIVI 
VAEQRPDSTEEDSFQWMSRVLQAIDS IHQVGVYCLALVPANTLP 
KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 
PBIGPASVMVGNLVSGKRXAQASGRDLGQlEDNDQARKFIiFLSE 
VLQWRAQTTPDHILYTLLNCRGAIANSLTCVQIjHKRAEKIAVML 
MB RGHLQDGDHVAIiVYP PG I DL I AAFYGCLYAGCVP I T VR P PH P 
CNIATTLPTVKMIVEVSRSACLMTTQLICKLLRSREAAAAVDVR 
TWPLILDTDD*PKKRPAQICKPCNPDTLAYLDFSVSTTGMLAGV 
KMSHAATSAFCRS I KLQCELYPSRE VA I CLDPYCGLG FVLWCLC 
SVYSGHQSILIPPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 
CTKGLGS QTES LKARGLDLSRVRTC WVAEERPR I ALTQS FS KL 
FKDLGLHPRAVSTSFGCRVNLAICLQGTSGPDPTTVYVDMRALR 
HDRVRLVERGS PHS LPLME SGK IL PG VR 1 1 IANP ET KG p LG DSH 
U3E I WVHS AHNAS G Y FT I YG D E SLQ S DH FNSRLS FGDTQT I WAR 
TG YLGFLRRTE LTDANG BRHDAL YWGALDEAME LRGMR YH P I D 
IETSVIRAHKSVTECAVFTWTNLLWWELDGSEQKALDLVPLV 
TNWLEEHYLI VG WVWD IG VI P INSRGEKQRMHLR1X3 Fl»ADQ 
LDPIYVAYNM 



1499 



5921 



727 



~TsT 



QU3AVAHAGVSRIPP*LFPPLHPTFLSLWCLHHKLP/HPPGASM 
VRPP WP RR PPAH I SS VRQ AS TQ VP RT V PH TQRVANI GTQTTG P 
SGVGCCTPGRPLLPCKCSS'AAHSTYRVQEPAVHIPGQEPLTASM 
LAAAPIjH EQKQM I GE RL Y P L I HD VHTQLAG K I TGM LLE I DNS E L 
IiLMLBSPESLHAKIDEAVAVLQAHQAMEQPKAYMH 



5922 



2475 



V CPGTGGE *GLWGQLGGLPKE T PLK PMDAFTGSG LKR KFD D V D V 
GSSVSNSDDElSSSDSADSCDSIiNPPTTASFTPTSILKRQKQLR 
RKNVRFDQVTVYYFARRC^FTSVPSCGGSSLGMAQRHNSVRSYT 
LCEFAQEQEVNHREILREHLKEEKLHAKKMKLTKNGTVESVEAD 
GIjTLDDVSDEDI DVENVEVDDYF FLQPLPTKRRRALLRASGVHR 
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAGIKCQVD 
RMS FPCGCSRDGCGNMAGR I EFNPIRVRTHYLHTIMKLELESKR 

Q\GAAQQPQ\*GALPDCQLQPDRSTGL*DPSWIGSKGLSFTGKG 
AAATHL 1 1 LR VI ENRGAEGKRK 



SYSNWGLFPSVFIQVPRSRTGNLKPIFI,FYSYYE\CMErLKG\T 














clynatoyk^csprndrpdacynpsepaattvfeirtgiCTgdT- 

SKI I TRTEEKE I PKQI TLRFDACAAINS KKLEIGCGS LN *ERS* 
RVENKYVCHESGVCKNCAYWPCVI * AT*KKNKNDSVYLQKGEAN 
PS CAAGHCNPLELI ITN PLDPHWKKGER VTLGINRTGL KPQ VVI 
LI KGEVHKCS PKPVFQTFYEELNLPAPELLKKTKNLFLQLAENV 
I FLLNGTS C YVRGGTT IGDRWP WEA * ELVPTD PAPD 1 1 P I * KAE 

ASNF*VLKTSIIRQYCIAREGKDFIIPVGKPNCIGQKLYKSTTK 
TIT**DLNHTEKNPFSKFSKLKTA*AHAESH*DWTVPSGLY*IC 
RHRAYFRLPNKWADSCVIGTIKPSFFLLPIKMGELLGFSVYASR 
E KKG I VI GNVJ KDN EW P RER 1 1 Q YYG P ATWAQDG S WG YR /TP / VY 
MLNWI I RLQAI LE 1 1 S N E TGRALT VLAWQE TQM RNA I YQNRLAL 
DYLLVAFX^VCRKFNLTNCCWJINDOGQVVKNIVRDMTKLAHVP 
IQVWHXFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 
P^ F Q M IKGIVATLVHQRTSAHVNYMNHYRSISQRDSKSEDESE 


5923 
5924 


137 


638 


gLCGRRGQRFRTSIKRMHPI*RTCPNTNL/ilLLSQENTQIRDL ' 
QQBNRELWISLEEHQDALELIMSKYRKQMLQLMVAKKAVDAEPV 
LKAHQ S HS AE I ESQ I DR I CEMGEVMRKAVQ VDDDQ FCK I Q E KLA 
QLELENKELRELLSISSESLQARKENSMDTASQAIK 




274 


2146 


EKGKVKDAGAEQWI SLSLSCKGS WETQFSNHLNS LTPPTS VRRM 
PLITTVTLLKMVARHHKKLLCSKAPSTQLCX3KIFLHSQMGIHHQ 
SVCMKLKPNTSHI ISILKGQPMALVQLETLAPLTI I IQKFQTQD 
HMKFWKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLXITKTIQNG 
RELFESSLCGDLLNRVQASE\Q*NQSIESRKEKRKKSNKKDSSR 
SEERKSHKIPKLEPEEQNRPNERVDTVSEKPREEPVLKEGSPSS 
ANTI FCSNNGS VHW\ FKFQVGDLVWS KVGTYFWWPCM VSS DPQL 
E VHTK I NTRGARE YH VQ FFSNQ PERAW VH EKR VR E YKGHKQ YEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
R EER I EQ YTF I Y I D KQPEE ALSQAKKS VAS KTBVKKTRRPRS VL 
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEPPPVKIAW 
KTAAARKSLPASITMHKGSLDLQKCNMSPWKIEQVFALQNATG 
DGKFIDQFVYSTKGIGNKTEISVRGQDRLIISTPNQRNEKPTQS 
VSSPEATSGSTGSVEKKQQRRSIRTRSESEKSTEWPKKKIKKE 
QVGFLHVES 


5925 




1911 


mmtaesreatglspqaaqekdgivivkveeedeedhmwgqdstl 

QDTP P PDPE I FRQR FRR FCYQNTFGPR EALSRLKELCHQWLR PE 
INTKEQILELLVLEQFLSILPKELQVWLQEYRPDSGEEAVTLLE 
DLELDLSGQQVPGQVHGPEMLARGMVPLDPVQESSSFDLHHEAT 
QSHFKHSSRKPRLLQSRALPAAHIPAPPHEGSPRDQAMASALFT 
ADSQAMVKI 3DMAVSL I LEEWGCQNLARRNLSRDNRQENYGS AF 
PQGGENRNENEESTSKAETSEDSASRGETTGRSQKEFGEFCRDQE 
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGERGPREKGK 
GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRHK 
IIHTGEKPYECSECGKAF\SLNS\NLVLHQRI\HTGEKPHECNE 
CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 
RIHTGEKPYECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK 
\AFTRSSTLTLHHRIHARERASEYSPASLDAFGAFLKSCV 


5926 
5927 


2 


233 


URCLMLKQGSyPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAIIEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 




4146 


1248 



KHFS KFGSQALYQLKRPASGQNS IS VMPAQKITKPAAKYGI PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 

KRRLEFIEKEICKOXTVlT T CTJUIVB PflMyDnovt>nt cin t xr™ unnftrt 

WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAI FDQ 
MQQQRAEDNEAKWKRBIYGRGLPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMG I LQNLAAM YGGRPS SS RGGKPRNKEEEV 
YLARLRQIRLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
^KGVKSSDVSPPLGQHETGGSPSKQQMRSVISVTSALKEVGVDS 
5LTDTRETSEEMQKTNNAISSKREILRRLNBNLKAQEDEKGKQN 
l^DTFEINVHEDAKEHEKEKSVSSDRKKWEAGGQLVIPLDELTL 
>TS FSTTERHTVGEVI KliGPNGSPRRAWGKS PTDS VLK I LGE AE 


5928 






LQLQTELIiENTTIRSElSPEGEKYKPLITGEKKVOCISHEINPS 
A I VDS P VE7KS P E FS EAS PQMS LKLEGNL EE PDDLETE I LQE PS 

GTNKDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 
VS S TVDQLS D I H I EPGTNDSQHS KCDVD KS VQ P BP FFH KWHS E 

HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEIDEI 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVIiKNSDVEPTANGTDVADEDDNPSSESAliNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHLEQSMGFEKFFBVYEKIKAIHE 
DEDENIEICSKI VCNI LGNEHQHLYAKI LHLVMADGAYQEDNDE 


5929 


4146 


1248 


KHFSKFGSQALYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISbMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQOQRAEDNE AKW KRE I YGRGL PE RQKGQLAVERAKQ VE E FLQR 
KREAMQNKARAEGHMGILQNIAAMYGGRPSSSRGGKPRNKEEEV 
xLARLRQIRLQNFNERQQI KAKLRG EKKEANHS EGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKS SDVS P PLGQHETGGS PS KQQMRSVI S VTS ALKB VG VDS 
SLTDTRETSBEMQKTNNAISSICREILRRLNENLKAQEDBKGKON 
LSDTFE INVHEDAKEHB KEKS VSSDRKKWEAGGQL VI PLDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVLKILGEAE 
LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS 
AIVDSPVETKSPEFSEASPQMSI.KLEGNLEEPDDLETEILQEPS 
GTNKDE \ SLPCT ITD VW I SEEKETKETQS ADR I T I QENEVS EDG 

VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWIISE 
HLNLVPQ VQSVQCS PEES FAFRSHSHLPPKNKNKNSLLIGLSTG 
LFDANNP KMLRTCSLPDLS KLFRTLMDVPTVGDVRQDNLE I DE I 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEOLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHDEQEMGFEKFFEVYEKIKAIHE 
JDSDEJ/IEICSKI VQN I LGNEHQHLYAKI LHLVMADGAYOEDNDE 


5930 


3 


1558 


bL)Kt>MTrQLPAYVAILLFYVSRASCQDTFTAAVyEHAAILPNAT 
LTPVSREEALALMNRNLDILEGAITSAADQGAHIIVTPEDAIYG 
WNFNRDSLYPYLEDIPDPEVNWIPCNNRNRFGQTPVQERLSCL\ 
AKNNSIYWANIGDKKPCDTSDPQCPPDGRYQYNTDWF\DSQG 
KLVARYHKQNLFMGENQFNVPKEPEIVTFNTTFGSFGIFTCFDI 
LFHDPA\TTLVKDFHVDTIVFPTAWMNVLPHLSAVEFHSAWAMGM 
RVNFLASN I HYPS KKMTGSG I YAPNSSRAFHYDMKTEEGKLLLS 
QLDSHPSHS A WNWTS YASS I E ALS SGNKE FKGT V FFDE FTFVK 
LTGVAGNYTVCQKDLCCHLSYKMSENIPNEVYALGAFDGLHTVE 
GRYYLQICTLLKCKTTNLNTCGDSAETASTRFEMFSLSGTFGTQ 
Y VF PEVLLS ENQLAPGE FQ VSTDGRLFS LK P TSGP VLTVTL FG R 
LYEKDWASNA5SGL?AQARIIMLIVIAPIVCSLSW 




113 


6082 



KGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK" 
K LVWI PS ERHGFEAAS I KEERGDEVMVE LAENGKKAMVNK DD I Q 

KMNPPKFSKVEDMAELTCLNEASVLHWLiCDRYYSGLlYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQSILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
I PG E \ LERQLLQANP I L ES FGNART VQNDNS SRFGKFIRI NFDV 

rGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAI4HIKG 
FS H E E I LSML KWS S VLQ FGN I S FK K ERNTDQAS M P ENTVAQ KL 
ZHLLGMNVME FTRA I LT P R I KVGRD YVQKAQTKEQAD FAVEALA 
<ATYERLFR WLVHR I NKALDRTKRQG AS F I G I LD I AG FE I FELN 
3FEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
)LQPCIDLIERPANPPGVIALLDEECWFPKATDKTFVEKLVQEQ 
3SHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
rVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
CKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 
iDPHLVLDQLRCNGVLBGIRJCRQGFPNRIVFQEFRQRYEILTP 








5931 






WAIPKGFMIX5KQACBRMIRALHLDP^YRIGQSklFFRAGVZiAH I 
UiKERDLK I TDI I 1 FFQAVCRGYLARKAFAKKQQQLS ALKVLQR 

NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 

BKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETBLFAEAEEM 

I^LAAKKQELEBILHDLSSRVEEEEERNQILONEKKKMQAHIQ 

DLE EQLDE EEGARQ KLQLEKVTAEAKI K KMEEEI LLLEDQNS KF 

IKEKKLMEDRIAECSSQLAEEEEKAKNLAK1RNKQBVMISDLEE 

RLKKEEICTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 

AKKEEEU3GALARGDDETLHKNNALXWREL0AQIABLQEDFES 

EKASRJJKAEKQKRnLSEELEALKTElEDTLDTTAAQQELRTKRE 

CEVAELKKALEEBTKNHEAQIQDMRQRHATALEELSEQLEQAKR 

FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 

QE LKAKVS EGDRLR VE LAEKAS KLQNELDNVS TL LEE AE KKG I K 

FAKDAASLESQLQDTQEliliQEETRQKLNLSSRIRQLEEEKNSLQ 

EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA ' 

KKKLLKDAEALSQRLEBKALAYDKLEXTKNRLQQELDDLTVDLD 

HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 

KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG 

KNVHELEKSKRALEQQVNEEMRTQLEELEDELQATEDAKLRLEV 

fJfiyAMKAgfbKDLQTRDEQNEEKKRLLI KQVRELEAELEDERKQ 1 

RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 

DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 

ERARRHAEQERDELADEITNSASGKSALLDEKRRLSARIAQLEE 

ELEEBQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 

QQLE RQNKELKAKLQELEGA VKS KFKAT I SALEAK I GQLEE QLE 

Q EA KERAAANKXVRRTE KKLKEI FMQVE D ERRHADQ YKEQM E KA 

NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANBGLSR 

EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 




113 


6082 



RGNCFW I VP FTMAQRTGLEDFERYLF VDRAVI YNPATQADT?TAK 
KLVWIPSERHGFEAASIKEERGDEVMVELAENGKKAMVNKDDIQ 
KMNPPKFSKVEDMAELTCLNEASVLKNLKDRYYSGLIYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQSILCTGESGAGKTENTKKVIQYIAHVASSHKGRKDHN 
IPGE\LERQLLQANPILESFGNARTVQNDNSSRFGKFIRINFDV 
TGYI VGAN I ETYLLEKSRAVRQAKDERTFH I FYQLLSGYAGEHIi 
KSDLLLEGFNNYRFLSNGYIPI PGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 
CHIJLGMNV^FTRAILTPRIKVGRDYVQKAQTKEQADFAVEAliA 
KAT YERLFRWLVHR INKALDRTKRQGAS FIG I LD I AGFE I FELN 

SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKAD2WLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATIiRNTNPNFVRCI IPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 
NAIPKGFMDGKQACBRMIRALBLDPNLYRIGQSKIFFRAGVLAH 
LBEERDLKITDIIIPFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYLFOuRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 
EKQTKVEGELEEMBRKIIQQLLEEKNILAEQLQAETELFAEAEEM 
RARLAAKKQELE E I LHDL-ES RVEEEE ERNQ I LQNEKKKMQAH IQ 

DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 
I KB KKLMEDR I AE CS S QLAE EE E KAKNLAKIRNKQEVM 1 S DL EE 
RLKKE E XTRQE LE KAKR KLDGETTDLQDQ I AE LQAQ IDELKLQL 
^KKEEELQGAIiAJlGDDBTIJiKNNALKVVRELQAQIAELQEDFES 
SKAS RNKAE KQ KRDLS E E LEALKTEL EDTLDTTAAQQELRTKRE 
2E VAELXKALE E E TKNHEAQ I QDMRQRHATAL E ELSEQLEQAKR 
^KANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 
3ELHAKVSEGDRLRVELAEKASKLQNEI1DNVSTLLEEAEKKGIK 
r AKDAASLESQLQDTQELLOEETRQKLNLSSRIRQLEEEKNSLQ 
'00 B E EE EAR KNLE KQ VUUjQSQLADTKKKVDDDLGTI ES LE EA 
KKKLLKDAEALSQRLEEKAIAYDKIaEKTKNRUXJELDDLTVDLD 
HQRQVASNLEKKQ\KKPDQl»LAEEKSISARYAEERDRAEAEARE 
KETKAiiSIiARALEBALBAKEE FERQNKQLRADMEDLMSS KDDVG 
KNVHELEKSKRALEOOV\EEMRTOr.FKT.PnPT^aTtrnnirr dt i?\r 

NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ 
RALAVAS KKKME IOLKDLEAQ I EAANKARDE VIKQLRKLQAQMK 
DYQRELE SARAS RDE I FAQS KESEKKLKSLEAEILQLQE3LASS 
ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLBE 
E L EE EOS NME LLND RFR PCTTLO VDT r .N AP T .a a. p o c n a n v c nxnv d 

QQLERQNKELKAKLQELEGAVKSKFKATISALEAKlGQIiEEQLE 
QEAKERAAANKLVRRTE KKLKE I FMQVEDERRHADQYKEQMEKA 
KARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 




RHLEEICFLFliQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FGATLAVGLTIPVLS WTI I ICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQ Y P P P Y P AO. PMGP PA YH BTLAGG AAA P Y PASQ PP YN PAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRS^DWSSGSSDAriMl5ASePgD~" 

SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 

AFS IGKMSTAKRTLS KKEQEELKKKEDEKAAAE I YEEFLAAFEG 

SDGN KVKTF VRGG WN AAKEEKETDE KRG KI Y K PSS R FADQKN P 

PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLEtiFKEELKQI 

QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 

DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCCEFGRFGP 

LASVKIMWPRTDEERARERNCX5FVAFMNRRDAERALKNLNGKMI 

MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 

RERLKNPNAPMbPPPKNKEDFEKTLSQAIVKWIPTERNLLALI 

HRMIEFVVREGPMFEAMIMNREINNPMFRFiFENQrPAHVYYRW 

iaYS I LQGDS PTKWRTBD FRMFKNGS FWRP P PLN PYLHGMS EEQ 

ETEAFVEEPSKKGALKEEQRDKLBEILRGLTPRKNDIGDAMVFC 

LNNAEAAEEIVDCITESLSHiKTPLPKKIARLYLVSDVLYNSSA 

KVANASYYRKFFETKTjCrtT F^riT.NATYPTTnr'UT nctTKTi7Trr»mnui 

rCFRAWEDWAIYPEPFIiIKLQNIFLGLVNIIEEKETEDVPDDLD 
GAP I E EE LDG APLE DVDG 1 P I DAT P I DDLDGVP I KS LDODLDG V 
P LDATE DS KKN E P I FKVAP S KWEAVDES ELEAQAVTTS KWE LFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHKLYSNPIKEEMTE 
S KFS KYS EMSEEKRAKLRE I ELKVMKFQDELESGKRPKKPGQS F 
QEQVBHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DBCTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKS RSQSRS PHRSHKKS KGKTNTGRKFFKKAVT YWKCDL F 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELfCKKBDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGN1\NPQMNLKKCCCQEFGRFGP 
IJ^VKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMHEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKl^DFEKTLSQAIVKWIPTERNUALI 
HRMIEFVVREGPMFEAMIMNREINNPMFRFLFEWQTPAHVYYRW 
KLYSIIiQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNniGDAMVFC 
LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSEWFKQRVM 
TCFRAWEDWAI YPEPFLI KLQNI FLGLVNI I EEXETEDVPDDLD 
GAP I EEJ3 LDG APLED VDG I P I DATPI DDLDGVPI KSLDDDLDG V 
PLDATE DS KKNE P I F KVAPSKW EAVDES E LEAQAVT7S KWEL FD 
QHEESEEEENQNQEEESEDSEDTQSSKSEEHHLYSNPIKEEKTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKErCLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5935 


3 


4493 


SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 

SDSGSFVSSRARREKKSKKGRQEALERLKKAKAGERYKYEVEDF 

TGVYEEVDEEQYSKLVQARQDDDWI VDDDGIGYVEDGRE I FDDD 

LEDDALDADE KGKDG KARNKD KRNVKKLAVTKPNN I KS M F I ACA 

GKKTADKAVDLSKDGLLGDI IiQDLNTETPQ ITPPP VM I LKKKRS 

IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPIiKRAEFAG 

DDVQVESTEEEQESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWD 

KESEPAEEVKQSACSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 

VQEVQVDSSHLPLVKGADEEQVFHFYWIiDAYEDQYNQPGWFLF 

GKVWIESAETHVSCCVMVKNIERTLYFLPREMKIDLNTGKETGT 

PISMKDVYEBFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKS 

EYLEVKYSAEMPQLPQDLKGETFSHVFGTNTSSLELFLMNRKIK 

GPCWLEVKKSTALNQPVSWCKVBAMALKPDLVNVIKDVSPPPLV 

VMAFSMKTMQNAKNHQNBI IAMAALVKHSFALDKAAPKPPFQSH 

FCWS KPKDC I FPYAFKEVI EKKNVKVEVAATERTLLGFFLAKV 

HKIDPDIIVGHNIYGFELEVLLQRINVCKAPHMSKIGRLKRSNM 

P KLGGRSG FGE RNATCGRM I CDVE I S AKEL IRCKSYHLS EL VQQ 

ILKTERWIPMENIQNMYSESSQLLYLLEHTWKDA\KFILQIMC 

ELNVLPLALQ I TN I AGN I MS RTLMGGRS ERNB FLLLHAF YENN Y 

I VPDKQ I FRKPQQKLGDEDEE IDGDTNKYKKGRKKGAYAGG LVL 

DPKVGFYDKFI LLLDFNSLYPSI IQEFNICFTTVQRVASBAQKV 

TEDGEQ3QIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 

LNPDLILQYDIRQKALKLTANSMYGCLGFSYSRFYAKPLAALVT 

YKGRE ILMHTKEMVQKMNLE VI YGDTDS IMINTNSTNLEBVFKL 

GNKVKSEVNKLYKLLEIDIDGVFKSLLLLKKKKYAALVVEPTSD 

GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIGQILSDQSRDTIV 

ENIQKRLIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 

HVHVALWINSQGGRKVKAGDTVSYVICQDGSNLTASQRAYAPEQ 

LQKQDNLTIDTQYYLAQQIHPWARICEPIDGIDAVLIATGWEL 

\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 

TCGT EN I YDNVFDGS GTDM E PSLYRCS N I DCKAS PLT FTVQLSN 

KLIMDI RRFI KKYYDGWLICEEPTCRNRTRHLPLQFSRTGPLCP 

ACMKATLQPEYS DKSLYTQLCFYRYI FDAECALEKLTTDHE KDK 

LKKQFFTPKVLQDYRKLKNTAEQFLSRSG YSEVNLS KLFAGCAV 

KS 


5936 


1124 


139 


RGEEQFDAE FRR F ACLGFGERLQ EFS RLLRAVH RS RAWTCY LA I 
RMLMATCCPSPTTTACTGPWQRAPPLRLLVQKREADSSGLAFAS 
NSLQRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPLGFYI RDGMSVRVAPQG \ LERVPG I FI 
S RLVRGG LAESTGLLAVS DE I LE VNG I EVAG KTLNQ VTDMMVAN 
SHN \L I VTVKPANQRNNWRG ASGRLTGP PS AGPG PAE PDS DDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937


31 


1600 


PTSLLKSTVQLMCRLLQDKRYQCVYSLAEIFKVLASFYVILVIL
YGLTSSYSLWWMLRSSLKQYSFEALREKSNYSDIPDVKNDFAFI 
LHLADQYDPLYSKRFSIFLSEVSENKLKQINLNNEWTVEKLKSK 
LVKNAQDKI ELHLFMLNGLPDNVFELTEMEVLS LELI PE VKLPS 
AVSQLVNL KE LR VYHS S L WDHPALAFLE EN LKILRLKFTEMG K 
IPRWVFHLKNLPCELYLSGCVLPEQLSTMQLEGFQDLKNLRTLYL 
KSSLSRIPQVVTDLLPSLQKLSLDNEGSKLVVLNNLKKMVNLKS 
LELI SCDLER I PHS I FS LNNLHELDLRENNLKT VEEI IS FQHLQ 
NLSCLKLWHNN I AY I P AQ IGAL SNLEQLS LDHNNI ENLPLQLFL 
CTKLHYLDLS YNHLTFI PEEIQYL\SNLQYFAVTNNNI EMLPDG 



5938 



395 



1865 



LFQCKKLiQCIiIjLGKNSLMNLS PHVGELSNLTHREPIG \N YLETL ' 
PPELEGCQSLKRNCLIVEENLLNTLPLPVTERU?TCLDKC 



5939 



66 



YKGEGFFCi'iyEARGERRKKKKAMSSPNIWSTGSSVYSTPVPSQK 
MTVWILLLLSLyPGFTSQKSDDDYBDVASNKTWVLTPKVPEGDV 
TVILNNLLEGYDNKIiRPDIGVKPTLIHTDMYVNSlGPVNAINME 
YTIDIFFAQTWYDRRLKFNSTIKVbRLNSNMVGKIWI PDTFFRN 
SKKADAHWITTPNRMLRIWNDGRVLYSLRLTrDAECQI^LHNFP 
MDEHS CPLE FSS YG YPR EEI V YQ WKRSS VEVGDTRS WR LYQFSF 
VGLRNTTEWKTTSGDYWMSVYFDLSRRMGYFTIQTYIPCTLI 
WLS MVS F W I N KDAV PARTS LG 1 TT VLTM TT LST I AR KS L P KVS 

YVTAMDLFVSVCFIFVFSALVEYG\TLIIYFVSNRKPSICDKDKKK 

KNPAPTIDIRPRSATIQMNNATHLQERDEBYGYECLDGKDCASF 

FCCFEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNbVYW 
VSYLYL 



1404



5940 



145 



IRPGYLKEVQENSPGHRAGLEPFFDFIVS1NGSRLNKDNDTLKD 
LLKANVEKPVKMLI YS S KTLELRETS VTPSNLWGGQGLLGVS IR 
FCSFIX;ANENViraVLEVESNSPAAIJVGIJ?PHSDYIIGADTVMNE 
S EDL FS L I E TH EAKP LKLY V YNTDTDNCRE V 1 1 TPNSAWGGEGS 
LGCGIGYGYLHRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTE 
VQI^SVNPPSLSPPGTTGIEQSLTGLSISSTP\PAVSSVLSTGV 
P TVP \ LLP PQVNQSLTS VP PMES S YLHLPG LNP FTRQG L PNL PQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 

TTAKADAASSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAA 
VDANASESP 



RRSASRSAS PRQSAGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLP VHMGLV I T EVEQE PS FSD I AS L WWCMAVG IS Y I S VYDH 
QGIFKRNNSRLMDEILKQQQEIiLGLDCSKYSPEFANSNDKDDQV 

LNCHLAVKVLSPEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT 
LA\ VYLVQMWL I L I 



6147 



ML.IiGHMGASSPRSPEPVGPPAPGLPFCCGGSLLA VWLLALPVA 

WGQCNA?EW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

SIICLKNSVWTGAKDRCRRKSCRNPPDPVNGKVHVIKGIQFGSQ 

IKYSCTKGYRLIGSSSATCIISGDTVIWDNETPICDRIPCGLPP 

TITNGDFISTNRENFHYGSWTYRCNPGSGGRKVFELVGEPSIY 

CTSNDDQVGIWSGPAPCCIIPNKCTPPNVENGILVSDNRSLFSL 

NEWEFRCQPGFVMKGPRRVKCQALNKWEPEIiPSCSRVCQPPPD 

VLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAASMRCTPQGDW 

SPAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEGF 

QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 

KPLEVFPPGKAVNYTCDPHPDRGTSFDLIGESTIRCTSDPQGNG 

VWSSPAPRCG1LGHCQAPDHFLFAKLKTQTNASDFPIGTSLKYE 

CRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKTPPDPVNGMVH 

VITDIQVGSRINYSCTTGHRLIGHSSAECILSGKAAHWSTKPPX 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGEPSIYCTSNDDQVGIWSGPAPQCI IPNKCTPPNVENGIL 

VSDNRS LFS LNE WE FRCQPG FVM KG PRR VKCQ ALNKWE PE L PS 

CSRVCQPPPDVLHAERTQRDKDNFSPGQEVFYSCBPGYDLRGAA 
SMRCTPQGDWSPAAPTCEVKSCDDFMGQLLNGRVLFPV^LQLGA 
KVDFVCDEGFQLKGSSASYCVLAGMESIiWNSSVPVCEQlFCPSP 
PVIPNGRHTGKFLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTI 
RCTSDPQGNGVWSSPAPRCGILGHCQAPDHFLFARLKTQTNASD 
FPIGTSLKYECRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKT 
PPD P VNGM VHVI TD 2 QVGSR INYS CTTGHRL IGHSS AE C I LSGN 

TAHWSTKPPICQRIPCGLPPTIANGDFISTNRZNFHYGSWTYR 
CNLGSRGRKVFELVGEPSI YCTSNDDQVG 1WSGPAPQCI I PNKC 
TP PNVEKG ILVSDNRS LFSLNE WEFRCQPG FVMKG PRRVKCQA 
LNKWEPBLPSCSRVCQPPPEILHGEHTPSHQDNFSPGQEV7YSC 
EPGYDLRGAASLHCTPQGDWSPEAPRCAVKSCDDFLGQLPHGRV 
LFPLNLQLGAKVSFVCDEGFRLKGSSVSHCVLiVGMRSLWNNSVP 
VCEHIFCPNPPAILNGRHTGTPSGDIPYGKEISYTCDPHPDRGM 


5942 






TFNLIGESTIRCTSDPHGNGVWSSPAPRCElLiVRAGHCKTPEQP 
P FAS PTI P I ND FE F PVGTS LNYECRPG YFG KM PS IS CLENLVWS 

SVEDNCRRK^CGPPPEPFKGN^Ifm>TQFGSTVNYSCNEGFRL 
IGSPSTTCLVSGNNVTWDXKAPICEIISCBPPPTISNGDFYSNN 
RTS FHHGTWTYQCHTG PDGEQL FELVG ERS I YCTS KDDQVGVW 
SSPPPRCISTNKCTAPBVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMVGSHTVQCQTNGRV/GPKLPHCSRVCQPPPEILHGEHTLSHQ 
DNFSPGQEVFYSCEPSYDbRGAASLHCTPQGDWSPEAPRCTVKS 
CD D FI/3 QL P HG R VLLPLNLQ LG AKVS FVCDEG FRLKGRS AS HC V 
LAGMKALWNSSVPVCEQIFCPNPPAILNGRHTGTPLGDIPYGKE 
VS YTCD PH PDRGMTFNL I G EST I R RTS EPHGNG VWS S P APRCEL 

PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDFQYLLVGK 
GFIFCTDQGIWSQLDHYCKEVNCSFPLFMNGISKBLEMKKVYHY 
GDYVTLKCEDGYTLEGSPWSCX^ADDRWDPPLAKCTSRTHDALI 

VGTLSGTIFFILLIIFLSWIILKHRKGNNAHENPKEVAIHLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5943


4509 


688 


vlyvrmranpiaVgishkayqidppl\rkhreq\Lvie\vgrkl 

DK\ AQM I RFBERTG YFSSTDLGRTASHYYIKYNTI ETFNELFDA 
HKTEGDIFAIVSKAEEFDQIKVREEEXEELDTLLSNFCELSTPG 
G VENS YG KIN I LLQT YI NRG EMDS FS L I S DS AYVAQN AAR I VRA 

LFEIALRKRWPTMTYRLLNLSKAIDKRLWGWASPLROFSILPPH 

MLTRLEEKKLTVDKLKDMRKDEIGHILHHVNIGLKVKQCVHQIP 

S VMM EAF I QP I TRTVLR VTLS I YAD FTNN DQVHGTVGE P W W I WV 

EDPTNDHI YHSEYFtiALKKQVISKEAQLLVFTI PI FEPLPSQYY 

IRAVSDRWLGAEAVCIINFQHLILPERHPPHTELLDLQPLPITA 

LGCKAYEAL YN FS H FNP VQ TQ I FHT L YHTDCNVL LG APTG S G KT 

VAAELAI FRVFNKYPTSKAVYIAPLKAIiVRERMDDWKVRI EEKL 

G KKVIELTGDVTPDMKS I AKADLI VTTPEKWCG VS RS WQNRNYV 

QQVTIJjIIDEIHLLGEERGPVLBVIVSRTNFISSHTEKPVRIVG 

LSTALANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 

H YCPRMASMNKPAFQAIRSHS PAKP VLI FVS SRRQTRLTALELI 

A?LATEEDPKQWLNMDEREMENIIATVRDSNLKLTLAFGIGMHH 

AGLHERDRKTVEELFVNCKVQVLIATSTLAMGVNFPAHLVIIKG 

TEYYIX5KTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVHDI 

KKDFYKKFLYEPFPVESSLLGVLSDHLNAEIAGGTITSKQDALD 

YITWTYFFRRLIMNPSYYNLGDVSHDSVNKFLSHLIEKSLIEIiE 

LS YCIE 1GEDNRSI EPLTYGRIAS YY YLKHQTVKMFKDRLKPEC 

STEELLS I LSDAEEYTDLP VRHNEDHMNSELAKCLP I ESNPHS F 

DSPHTKAHLLLQAHLSRAMLPCPDYDTDTKTVLIX3ALRVCQAML 

DVAANQGWLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENHHL 

HLFKKWKPIMKGPHARGRTSIECLPELIHACGGKDHVFSSMVES 

ELHAAKTKQAWNFLSHLPEINVGISVKGSWDDLVEGHNELSVST 

LTADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 

PRFPKSKDEGWFLILGEVDKRELIALKRVGYIRNHKVASLSFYT 

PEIPGRYIYTIiYFMSDCYLGLDQQYD/NLSQRYTSESFCTGQHQ 
Oh 




1 - 


2274 


1 

1 


UKPTRHKTYLbSSWAKMAAAEGPVGDGELWQTWLPNHVVFLRLR 
EGLKNQS PTEAEKPASSSLPSS PPPQLLTRNWFGLGGELFLWD 
GEDSSFLWRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 
LSPTQHHVALIGIKGLMVLELPKRWGKNSEFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAWYPSEILDPHVVLLTSDNVIRIYSLR 
cry i r i w v 1 1 LibbAKiBSLVLNKGRAYTAS LGETAVAFDFGPLA 
AVPKTLFGQNGKDEWAYPLYILYENGETFLTYI SLLHS PGN / 1 
WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPNILVIATESGML 
yHCVVLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVELELALKL 
^SGEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWI 
^KLHKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLPCRQPAP 
rRGFWrVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCTR 
3DVE VAES PJjRVLAET PDS FEKH I RS I LQRS VANP A FL KAS EKD 
C APPPE ECLQL LS RATQ VFREQ Y I LKQDLAKE E I QRR VKLLCDQ 
CKKQLEDLSYCREERKSLREMAERLADKYEEAKEKQEDIMNRMK 








KLLHSFHSELPVLSDSERDf'lXKELQLlPDQLRHIiGNAIKQVTMK 
KDYQQQKMEKVLSLPKPT 1 1 LSAYQRKCIQS I LKEEGEHI REMV 
KQINDIRNHVNF 


5944 


167 


3428 


raiATFTDEPEVLTEPPSATTTTTIGISATWTTLAGSHGKRNNT 

ITTTSSKRKNRKNKITPEWVQIIFDDPLPISYSQPEKVNGESKS 

SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAVVTTTVSS 

fCKQPSVLVTFPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 

GPSPLSSPNGKL7VASPKRGQKREEGWKEWRRSKKVSVPSTVI 

SR VI GRGGCN I NA I R E FTG AH I DI D KQKDKTGDR 1 1 TI RGGTE5 

TRQATQLINALIKDPDKBIDELIPKNRLKSSSANSKIGSSAPTT 

TAANT S LMGI KMTTVALS S TS QTATALT V PAI S S AS THKTI KN P 

VNNNVRPGFPVSFPXiAYPPPQFAHALIiAAQTFGQIRPPRLPMT 

HFGGTFPPAQSTVJGPFPVRPLSPARATNSPKPHMVPRRSNQNSS 

GSQVNSAGSLTSSPTTrTSSSASTVPGTSTNGSPSSPSVRRQLF 

VTWKTS NATTTTVTTTASNNNTA PTNATY PM PTAKE HY P VS S ? 

SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 

TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 

EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 

PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 

LPSTLSTQ3ACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 

SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 

FRPPLQRPAPSPSGIVNMDSPYGSVTPSSTHLGNFASNISGGQM 

YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAQSVSS 

GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 

TSAPSVIGSNLSTSVGHSGI WS FEGIGGNQDKVDWCNPGMGNPM 

IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 

VGGMPFSVYGNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSL1K 

MVS SSTENNG PQTVWTG PWA PHMNSVH MNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRNG IAEDLKGQADFF FliLVS EAWATGS PRA 
WLTCLI L PI* PGI I FS VL PKAMS RP LL I T FTPATD PS DLWKDGQQ 
QPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARER 
KRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRIfcRAA 
QEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQG 
AAVSYLLGRGAAVA/GVCELSGRDAAQLAEEAGFPEVARMVRESH 
GETRS P ENR S PTPS LQ YC ENCDTH FQDSNHRTS TAKL LS LSQG P 
QPPNLPI/jVPISSPGFKLLLRGGWEPoMGLGPRGEGRANPIPTV 
LKRDQEGLG YRS A PQ PRVTH FP AWDTKAVAGRE \ TP P RVATLS W 
RE ERRREE \ KDRAWERDLRTYMNLEF 


5946 


541


l*6(i 


ILGSYSSIQPEEYS\SWC\EWLQDLIA\YVSPK\HSYLRDLP 
SEGSPQRVNS IDFV\EL\EHLQPDVLVIIAVLRWDF/TI LTEAV 
YSYRGQKQKKVMt,TVEQAQDQHYAIiVLWGPGAAW\YPQLQRKKG 

yiwefkylfvqcnytlenlelhttpwssceclfdddiraitfka 

KFQKSAPSFVKISDLATHLEDKCSGWLIKAQISELAFPITASQ 
KIALNAHSSLKSI FSSLPNIVYTGCAKCGLEI.ETDENRI YKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVIVPSSEITYGMVVADLFHSLLAVSAEPCVLKIQSLFVL 
DENSYPLQQDFSLLDFYPDIVKHGANARL 




3 


1317 


RG I PDRRRRG P I GR VNMDLENK VKKMGLGHEQG FGAPCLKC KE K 
CEGFELHFWRKICRNC\NVAKKSM/TVLLSNEEDRKVGKLF3DT 
KYTTLI AKLKSDGI PMYKRNVM ILTNPVAAKKNVS INT VTYEWA 
P PVQNQALARQYMQML PKE KQ PVAGS EG AQ YR KKQLAKQ L PAHD 
\ju v j> ft.LMt, iji> PKt VKJiMEQ rVKKYKS EALG VG D VKLPC EM DAQG 
PKQMNIPGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 
DPAI YAERAG YDKLWHPACFVCSTCHELLVDM I YFWKNE KL YCG 
RHYCDSEKPRCAGCDELI FSNEYTQAENQNWHLKHFCCFDCDS I 
LAGE I YVMVNDKP VC K P CYVKNHAWCQG CHNAI D P EVQRVT YN 
NPSWHASTECFLCSCCSKCLIGQKFMPVEGMVFCSVECKKRMS 


5948 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSJULPEGFFSGPIDQ 
GNHYQMRRKGRCHRGSAARHPSSPCSViCHSPTRETLrYAQAQRM 
VE I E I EGRLHR ISIFDPLEII LE DDLTAQEMS E CNSNKENSE R P 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 

SPPSAPRRPPVyyKFIEKSAEE LDNBVBYDMDEEDYAWLEIVNE 



39 



3370 



or^oMrruw^viiAJriEKSAJBELDNEVBYDMiJEEDYAWLEIVNE 
KRKGDCVPAVS QSM FE FLMD RFE KES HCENQKQGEQQ S L I DB DA 
VCCIOMDGECQNSNVILFCDMCNIiAVHQECYGVPYI PEGQWIjC / 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW\ I P 
E\VGFANTVFIEPIDG\^IPPARW?CLT\C2JLCKBKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYM KMEPVKELTGGGTTFS VRKTA 
YCDVHTPPGCTRRPLNIYGDVBMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKE KLK YWQRLRH DLERARLL I ELLRKREKLKREQVKVEQVA 
MELR LTPLTVLL R 3 VLDQLQDKD PAR I FAQP VS LKE VPDYLDH I 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAf>IKSSG 
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 
TNGFGGARSBQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSEL I S CI ENGN YAKAAR T AAEV 
GQSSMWISTDAAASVLBPLKWWAKCSGYPSYPALIIDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WL P KS KMVPLG IDETI DKLKMMEGRNS SIR KAVR I AFDRAMNHL 

srvhgeptsdlsdid 

yrerypvsggsvlrsaleVcW dflsgltegsllpegffsgpidq 

GNHYQMRRKGRCHRGSAARHPSSPCSVXHSPTRETLTYAQAQRM 

veihiegrlhrisifdpleiileddltaqemsecnsnkenserp 

PVCLRTKRHKNNRVTf RIf KTRaT.DO jv wntt a o ~ ^ » T «t-,~. 



5950 



1166 



373 



5951 



143 



-,*^^rrv i * ^ r xci^AJS^bUNEVEYDMDEEDYAWLEI VNE 

krkgdcvpavsqsmfeflmdrfekeshcenqkqgeqqslideda 

VCXTICMDGEOONSNVIIjFCDMCNLAVHQECYGVPYIPEGQWLC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP 
EWGPANTVFIEPIDGVRNIPPARWFCLTNCNLCKEKGR/VGACI 

qchkancy^afhvtcaqkaglymkmbpvkeltgggttfsvrkta 
ycdvhtppgctrrplniygdvemkngvcrkessvktvrstskvr 

KKAKKAKKALAE PCAVL PTVCAP Y I P PQRLNRI ANQ VAI QR KKQ 
FVERAHSYWLLKRL3RNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 

melrltpltvllrsvldqlqdkdparifaqpvslkevpdyldhi 
khpmdfatmrkrlbaqgyknlhefeedfdliidncmkynardtv 
fyraavrlrdqggwlrqarrevdsigleeasgmhlperpaaap 

RRPFSWEDVDRXiDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 

srskrakllkkeiallrnklsqqhsqplptgpglegfeedgaal 
gpeageevlprletllqprkrsrstcgdseveeespgkrldagl 

TNGFGGAR8EQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARIAAEV 

gqssmwistdaaasvleplkwvjakcsgypsypaliidpkmprv 

pghhngvtipappldvlkigehmqtksdeklflvlffdnkrswq 

wlpkskmvplgidetidklkmmegrnssirkavriafdramnhl 
srvhgeptsdlsd id 

esrsltmstsqpgacpcqgaasrpailyallssslkavpr^rsr 

CLCRQHR P VQLCAPHRTCRE ALD V LA KTVA FLRNLPSFWQL PPQ 
DORRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLELSPKE\YACL 
KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLKS I PTS LLGDLFFRP I IGDVDI AGLL GDMLLLR 
WNVKPSLLWy^FKFSDKEEHEQNDSISGKTGETGVEEMlATRlT 
VEQDSKETVKLSHEDDHILEDAGSSDISSDAACTNPNKTENSLV 
GLPSCVDEVTECNLELKDTMGIADKTENTLERNKIEPrCYCEDA 



QLNAIESTKIESHETANLQDDRNSQSSSVSYLESKSVKSKHTKP 
VIHS KQNMTTDAPKKI VAAKYE VIHSKTKVNVKS VKRNTDVPES 
QQNFHRPVXVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKKTLQ 


5952 






DQTLVQIFKPLtHSLSDKSHAHPGCLKEPHHPAQTCHVSHSSQK
QCHKPQQQAPAMKTNSHVKBELEHPGVEHFKEEDKLKLKKPEKN 
LQPRQRRSSKSPSLDEPPLFIPDNIATIRREGSDHSSSPESKYM 
WTPS KQ OG F CKKPHGN R FM VGCG RCDDW FHGDCVGLS LSQAQQM 

GEEDKEYVCVKCCAEEDKKTEILDPDTLENQATVEFHSGDKTME 
CEKLGLSKHTTNDRTKYIDDTVKHKVKILKRESGEGRNSSDCRD 
NEIKKWOLAPLRKMGQPVLPRRSSEEKSEKIPKEST-^VTCTGEK 
ASKPGTHEKQEMKKKKV\EKGVLNVHPAASASKPSADQIRQSVR 
HSLKDILMKRLTDSNLKVPEEKAAKVATKIEKELFS FFRDTDAK 
YKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMSPEELAS 
KELAAWRRRENRHTIEMIEKEQREVERRPITXITHKGEIEIESD 
APMKEQEAAMEIQBPAANKSLEKPEGSEKNRKEEVDSMSKDTTS 
QHRQHLFDLNCK1CIGRMAPPVDDLSPKKVKVWGVARKHSDNB 
AESIADALSSTSNILASEFFEEEKQESPKSTFSPAPRPEMPGTV 
EVESTFLARI*NFIWKGFINMPSVAKFVTKAY?VSGSPEYLTEDI> 
PDSIQVGGR1SPQTVWDYVEKIKASGTKEIC/VRFTPVTEEDQI 
SYTLLFAYFSSRKRYGVAANNMKQVKDMYLIPLGATDKIDHPLV 
P FDG PGLELHRPNLLLGL 1 1 RQKLKRQHS ACASTSH I AETPES A 
PPIALPPDKKSXIEVSTEEAPEEENDFFNSFTTVLHKQRNKPQQ 
NIiQEDLPTAVEPLMEVTKQEPPKPI/RFLPGVLIGWENOPTTLEL 

ankplpvddilqsllgttgqvydq\aqsvmeqntvkbipflneq 

TNSKIEKTDNVEVTDGENKEIKVKVDNISESTDKSAEIETSWG 
SSS IS AGSLTSLS LRGKPPDVS TEAFLTNLS IQS KQEETVESKE 

KTLKRQLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSEN 
L VANTARS PQ F I NLKRD PRQAAGRS Q P VTTS E S KDGDS CRNGEK 

HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 
PSVENIQTSQAEQAKPLQEDILMQNIETVHPFRRGSAVATSHFE 
VGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRPQQPNLQHLKS 
S P PG FP FPG PPNFPPQSM FGFP PHL P PPLLP P PGFG \ FA\ QNPM 

VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKD-RRPE 
RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 
WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 
G KASRDS RNVD KKPDKP KS ED YE KDKE RE KS KHREGE KDRDR YH 
KDRDHTDRTKSKR




3226 


639 


PPARRSARDLPRALSMEAARPSGSWNGALCRLL\LVTL\AFLIF 
ASDACKNVTLHVPSKLDAEKLVGRVNLKECFTAANLIHSSDPDF 
QI LEDGS VYTTNTI LLS S EKRS FT I LLSNTENQE K KK I F VFLEH 
QTKVLKKRHTKEKVLRRAKRRWAP I PCSMLENSLGPFPLFLQQV 
QSDTAQNYTI YYS I RGPG VDQEPRNLFYVERDTGNLYCTRP VDR 
EQYESFEIIAFATTPDGYTPELPLPLIIKIEDENDNYPIFTEET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYS I IGQVPPS 
PTLFSMHPTTGVITTTSSQLDRELIDKYQLK1KVQDMDGQYFGL 
QTTSTCIINIDDVNDHLPTFTRTSYVTSVEENTVDVEILRVTVE 
DKDLVNTANWRANYTILKGNENGNFKIVTDAKTNEGVLCVVKPL 
NYEEKQQMIIiOIGVVNEAPFSREASPRSAMSTATVTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTI DENTGS I KVFRS LDREAET I KNG I YN I TVLAS DQG 
GRTCTGTLGI I LQDVNDNS PFI PKKTVT ICKPTMSS AEIVAVDP 
DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNDPPF 
GSYWPITVRDRIKJMSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLGKWAILAILLGIALFFCILFTLVCGASGTSKQPKVI PDD 
[iAQONLI VSNTE APGDDKVY 4 ? A^PTTriTUna c a rv-irnniri tr> r.^-. 

IKNGGQETIEMVKGGHQTSESCRGAGHHHTLDSCRGGHTEVDMC 
R YT YS E WHS FTQPRLGEES I RGHTL I KN 


5953


330 


811 



PLLCNPWPGWYWWVXQESEISKESQEMDARPKLDLGFKE<3QTTir| 
LCIGNITNKKGGASKPRTARGGGLSLLPPPPGGKVTIPPPSS /V 
fCLPSTNHVTPPS IPKSNHGGSDADILLDLDSPAPVTTPAPTPVS 
/SNDLWGDFSTASSSVPNQAPQPSNWVQF 


5954 


32 


2130 

1 



?PPPPPKlJ\NMADLEAVIjADVSYLMAMEKSKATPAAkASKR"T^ 

PEPSIRSVMQKYIJVERNEITFDKIFNQKIGFLLFiCDFCLNEINE 

WPQVKFYBSIKEYEiCLDNEEDRLCRSRQIYDAYIMKELLSCSH 






WO 01/53312 



PCT/US00/34263 





5955 






PFSKQAVEHVQSHbSiCKQVTSTLFQPYIBtlCKSLKGDiFQKFM 
ESDKFTRFCQWKNVELNIHLTMNEFSVHRIIGRGGFGEVYGCRK 
ADTGKMYAMKCLNKKRIKMKC^BTIJu^NERIMLSLVSTGDCPFI 
VCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSBKEMRFYA 
TBI iLGLEHMHNRFWYRDLKPANIIiLDEHGHARia \DLGLACD 
FS KKKPHAS VGTHGYMAP E VLQKGTAYDSS ADWFSLG CM L FKLL 
RGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPELKSLLEGLL 
QRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPP 
RGEVNAADAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERW 
QQEVTETVYEAVNADTDKIEARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE 
TFKEAQRLLRRAPKFLNKPRSGTVELPKPSLCHRNSNGL 




1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR 
PANRQDVLSGWINLPVLQLTKDPLFCTPGRLDHGTRTAFIHHREQ 
VWKRCINIWRDVGLFGVLNEIANSEEEVFEWVKTASGWALALCR 
WASSLHGSLFPHLSLRSEDLIAEFAQVTNWSSCCLRVFAWHPHT 
NXFAVALLDDSVRVYNASSTIVPSLKHRLQRNVASLAWKPLSAS 
VLAVACQSC ILI WTLDPTS LSTP PS SGCAQVLSHPGHTPVTSLA 
WAPSGGRZ^SASPVDAAIRVWDVSTETCVPLPWFRGGGVTNLLW 
SPDGS KI LATTPSAVFRVWEAQMWTCEROTTbSGRCQTGCWS PD 
GSRLLFTVLGEPLIYSLSFPERCGEGKG\ALBVQSQQRIiWQICL 
ROQ YRHQMVRRG LGERLT PWSGT PVGNVWLCL 


5956 


1705 


139 


CiVGVRGARAlvuVTVQEKAAALN^ALliSPAHRPPGFSVAQKPPGA - 

TYVWSSI INTLQTQVEVKKRRHRLKRHNDCFVGSEAVDVIFSHL 

IQNKYFGDVDIPRAKWRVCQALMDYKVFEAVPTKVFGKDKKPT 

FEDSSCSLYRFTTIPNQDSQLGKENKLYSPARYADALFKSSDIR 

SASLEDLWENLSLKPANSPHVNISATLSPQVINBVWQEETIGRL 

LQLVDLPLLDSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGILK 

a ysds oedewlsaa i dcs b yl pdqm wei srs fpeqpdrtdlvk 
ellfdaigryyssrepllnhlsdvhngiaellvngkteialeat 
qlllklldfqnreefrrllyfmavaanpsefklqkesdnrmwk 
rifskaivdnknls:<gktdllvlfl\mdhqkdvfkipgtl\hki 
vs \ vk \ lmai qngrdpnr dagy i ycqr i dqrd ysnntekttkde 

LLNLLKTLDEDSKLSAKEKKK\LLGQFYKCHPDIFIEHFGD I 


5957 


1479 


451 


ELQVAVAMDTLDRWKPKTKRAKRFLEKREPKLNENIKNAMLIK

GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 

SECKSDCSL.FMFGSHNKKRPNNX.VIGRMYDYHVLDMIELGIENFV 

S LKD I KNS KCPEGTKPML I FAGDDFDVTEDYRRLKS L LI D FFRG 

PTVSNIRLAGLEYVLHFTAI1NGKIYFRSYKLI1LKKSGCRTPRIE 

LEEMGPSLDLVLRRTHLASDDLYKLSMKMPKALKPKKKKNISHD 

T?GTTYGRIHMQKQDLSKLQTRKM\KGLKKRPAERITEDHEKKS 

KRIKKKLMELSQPLLFHCVIiLKRIIKHQSIQSFL 


5958 


1 


3138 



AAALGMLLWFPACQAFNLDVEKLTVYSGPKGSYFGYAVDFHIPD 
ARTAS VLVGAPKANTSQPD I VEGGAVYYCPWPAEGS AQCRQI P F 
DTTNNR K I R VNGTKE PIE FKSNQW FG \ ATVKA\H KGKSCG P VA P 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 
EGQGYCQAGFSLDFYKNGDLIVGGPGSFYWOGQVITASVADIIA 
NYS FKD I LR KLAGE KQ TEVAPAS YDDS YLG YSVAAGEF TGDS QQ 
ELVAG I PRGAQNFGYVS I INS YDMTFIQNFTGEQMASYFGYTW 
VSDVNSDGLDDVLVGAPLFMEREFESNPREVGQIYLYLQVSSLL 

FRDPOILTGTETFnRFfiQJXMIlMT^riT M^IVVTVTnT nTnimnnnvTs 

QRGKVL I YNGNKDGLNTKPF PKFCQGVWAS HAVPSG FG FT LRGD 
SDIDKNDYPDLIVGAFGTGKVAVYRARPVVTVDAQLLLHPMIIW 
LENKTCQVPDSMTSAACFSLRVCAS VTGQS IANTIVLMAEVQLD 
5 LKQ KGAI KRTLFLDNHQ AHRVFPL VI KRQ KSHQCQD F I VYLR D 
STEFRDKLSPINISLNYSLDESTFKEGLEVKPILNYYRENIVSE 
}AHI LVDCGEDNLCVPDLKLSARPDKHQVI IGDENHLMLI INAR 
^EGEGAYEAELFVMIPEEADYVGIERNNKGFRPLSCEYKMENVT 
IMWCDLGNPMVSGTNYSbGLRFAVPRLEKTNMSINFDLQIRSS 
^ICDNPDSNFVSLQINITAVAQVEIRGVSHPPQIVLPIHNWEPEB 








ephxeeevgpuvehiyelhnigpstisdtilevgwpfsardefl 

LY I FH IQTLG P LQCQ PNPNI N PQDI KPAASPEDTP ELS A FLRJN S 
TI PHL VRKRD VHWE FHRQS PAKI LNCTN I ECLQ I S CAVGRLEG 
GESAVLKVRSRLWAHTPLQRKNDPYALASLVSF^EVKKMPYTDQP 
AKLPEGS I AI KTS VI WATPNVS FS I PLWVI ILAI LLGLLVIAI L 
TLALWKCGFFDRARPPQEDMTDREQLTNDKTPEA 


5959


1 


1166 


GTSGYAAOQLPSLLKEREFHIjGTLNKVFASQHLNHRQWCGTKC
NTLFWDVQTSQITKIPILKDREPGGVTQQGCGIHAIELNPSRT 
LLATGGDN PNS LAI YRL PTLDP VCVGDDGH KDW I FS IAW ISDTM 
AVSGS RDGSMGLWEVTDDVLTKS DARHNVSRVPVYAHI THKALK 
DI PXEDTNPDNCKVRALAFNNKNKELGAVSLDG YFHLWKAENTL 
SKLLSTKLPYCRENVCLAYGSEWSVYAVGSQAKVSFLDPRQPSY 
NVKSVCSRERGSGIRSVSFYEHIITVGTGQGSLLFYDIRAQRFL 
EERLSACYGSKPRLAGENLKLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 


2853 


870 


FVWSDGGPRPRRGPAVGAGAAKLSDPWAMTPGTANRATNPLNKE 
LD WAS ING FC EQLNE DFEG P PLATR LLAHKI QS PQEWE AI QALT 
VLETCMKSCGKRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKILELLYS WTVG LPEE VK I AEA YQMLKKQG \ I VKS D P KL P DDT 
TFPLPPPRPKNVIFEDEEKSKMLARLLKSSIIPEDLRAANKLIKE 
KVQEDQKRMEKIS KRVNAI EEVNNNVKLLTEMVMSHS QGGAAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EAIiAEILQA 
NDNLTQVINLYKQLVRGEE VNGDATAGS I PGSTS ALLDLSGLDL 
PPAGTTYPAMPTRPGEQASPEQPSASVSLLDDELMSLGLSDPTP 
PSGPSLDGTGWNSFQSSDATEPPAPALAQAPSMESRFPAQTSLP 
ASSGLDDLDLLGKTLLQQSLPPESQQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LESIKPSNILPVTVYDCHGPRILFHFARDPLPGRSDVLWWSM 
LS TAPQ P I R N I VFQS AVP KVMKVKLQP PSGTE L PAFN P I VH P S A 

ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


sgeprpepgnmatcigekiedfkvgnllgkgsfagvyraesiht 
glevaikmidkkamykagmvqrvqnevkihcqlkhpsilelyny 
fedsnyvylvlemchngemnrylknrvkpfsenearhfmhqiit 
gmlylhshg i lhrdltlsnllltrnmni kiadfglatqlkmphe 

KHYTLCGTPNYISPEIATRSAHGLESDVWSLGCMFYTLLIGRPP 
FDTDTVKNTLNKVVLADYEMPTFLSIEAKDLIHQLLRRNPADRL 
SLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 
STSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 
FYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 
SQSQAKTYTMERCHSAEMLSVSKRSGGGENEBRYSPTDNNANIF 
NFFKE KTS SS SG S FERPDNNQALSNI ILC PG KT P FP FADP T PQTE 
TVQQWFGNLQINAHLRKTTEYDSISPNRDFQGHPDLQKDTSKNA 
WTDTKVKKNSDASDNAHSVKQQNTMKYMTALHSKPEIIQQECVF 
GSDPLS EQSKTRGM3PPWGYQNRTLRS I TSPLVAHRLKP I RQKT 

kkaws i ldseevcvelvke yasqeyvkevlqi ssdgntiti yy 
pngg\rgfpla\drppspt\dnisr\ysf\dnlpekywrkyqya 

S RFVQL VRSKS P K I T YFTR YAKCI LM ENS PGAD FE VW F YDGVK I 

HKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIKMYMDHANEGHR 

ICLALES I ISEBERKTRSAPFFPI I IGRKPGSTSSPKALSPPPS 

VDSNYPTRDRASFNRMVMHS AAS PTQAP ILNPSMVTN3GLGLTT 
TASGTDTSSNRTiKDPT.DvcariT t ifcat/cn/ vxrxwvj ?i mrv\ r tnno« tn i 

VQFNDGSQLWQAGVSSISYTSPNGQ\TTR\YGENEKLPDYIKQ 
KLQCLSSILLMFSNPTPNFH 


5962 


20 


2447 


RVCSSS AS TASQAVMADAWE E IRRLAAbFQRAd FAEATQRLS E R 
NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 
GRVN I VDLQQ VI NVDL I H I ENR IGDI I KS E KHVQL VLGQLI DEN 
YLDRLAEEVNDKLQESGQVTISELCKTYDLPGNFLTQALTQRLG 
R I ISGH I DLDNRGVI FTEAFVARHKAR IRGLFSAITRPTAVNSL 
I S KYGFQEQLL YS VLEEL VNSGRLRGT WGGRQDKA VFVPD I YS 
RTQSTWVDSFFRQNG YLEFDALSRLGI PDAVSYIKKRYKTTQLL 


5963






FLKAACVGQGLVDQVEASVEHAISSGTWVDIAPLLPTSLSVEDA 
AILLQQV^FSKQASTWFSDTVWSEKF\INDCTELFREI^IH 
QKAEKSMKNNPVHLITEEDLKOI STLESVSTSKKDKKDFRR R 
TEGSGS tMRGGGGGN ARE YKI K KVKKKGRKDDDSDDES Q S S HTGK 
KKPEISFMFQDEIEDPLRKHIQDAPEEFISELAEYLIKPLNKTY 
LEWRSVFMSSTTSASGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKFFADDTQAALTKHLLKSVCTDITNLIFNFLASDLMMAVDDPA 
AITSEIRKKILSKLSEETKVALTKLHNSLNEKSIEDFISCLDSA 
AEACDlMVKRGDKKRERQrLFQHRQALAEQLKVTEDPALILHLT 
SVLLFQFSTHSMLHAPGRCVPQI IAFtiNSKI P EDQHALL VKYQG 
LWKQLVSQS KKTGQGDY P LNNELDK3QEDVAS TTRKELQELSS 
SI KDLVLKSRKSSVTEE 


5964


62 


1130


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAPGMP\GLMGSN I 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGAIXjSPGEPGY 
MGLPGIQGKKGDKGNQGEKGIQGQKGEN3RQGIPGQQGIQGHHG 
AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GP PGLDGKPGRE FSEQF IRQ VCTDVIRAQLPVLLQSGR IRNCDH 
CLS QHG5 PG I PG P PG P I G P EG PRG L PGLPGRDG VPGL VG VP GRP 

GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 

PPGDPGLPGKDGDHGKPGIQGQPGPPGICDPSLCFSVIARRDPF 
RKGPNY 1 




3 


2147 


SCRTRGRL^PLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTKI 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGLYNDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGE VTF ENVKE I FGQTI IHHH I P FNWDC E F I RLHFG HNR 
KKHLNYTEFTQFLQELQLEHARQAFALKDKSKSGMISGLDFSDI 
MVT I RSHMLTP FVE ENLVS AAGGS I S HQ VS FS YFNA FNS LLNNM 

ELVRKIYSTLAGTRKDAEVTKEEFAQSAIRYGQATPLEIDILYQ 
LADLYNASGRLTLADIERIAPT.APf;aT,l3VNrr.nwT nonncncT 

PIWLQIAESAYRFTLGSVAGAVGATAVYPIDLVKTRMQNQRGSG 
SWGELMYKNSFDCFKKVLRYEGFFGLYRGLIPQLIGVAPEKAI 
KLTVKDFVRDKFTRRIXSSVPLPAEVIiAGGa^SQVIFTNPLEI 
VKIRLQ VAGE I TTG PR VSALNVLRDLGI FGLYKGAKAC FLRD I P 
FSA I YFPVYAHCKLLLADENGHVGGLNLLAAGAMAG \ VPAASLV 
TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 
TAARVFRSSPQFG\VTLVTYELLQRGFYIDFGGLKPAGSEPTPK 
5 RI ADLP PANPDH I GG YRLATAT FAG I ENK FGLYLP KFKS PS VA 
WQPKAAVAATQ **


5965 
5966


1 


1498 


MVTWIjYRFLPTSNMAAJajRSLLPPDLRLQFWLHARLQKCFl,SRG
CGSYCAGAKASPLPGKMAMGLMCGRRBLLRLLQSGRRVHSVAGP 
SQWXiGKPLTTRLLFPAAPCCCRPHYLFLAASRPP <?r,QTQaTc:pa 

EVQVQAPPWAATPS PTAVPEVASGETADWQTAAEQS FAELGL 
GS YTPVGLIQNLLEFMHVDLGLPWWGAIAACTVFARCLI FPLIV 
TGQREAAR IIINHL PE I Q KFSS R I REAKLAGDH IE YYKAS S EMAL 
YQXKHGIKLYKPLILPVTQAPIFISFFIALREMANLPVPSLQTG 
GLW WFQDLTVS DP I Y I L PLAVTATMWAVLELGAETG VQS S DLQ W 
MRNVIRMMPLITLPITMHFPTAVFMYWLSSNLFSLVQVSCLRIP 
AVRTVLKI PQRWHDLDKLPPREGFLES FKKGWKNAEMTRQLRE 

REQRMRNQLELAARGPLRQTFTHNPLLQPGKDNPPNIPSSVSSS 
SSKPKSKYPWHDTLG 




102 


1925 



RSKQVMARLTKRRQADTKAIQHLWAAIEIIRKQKQIANIDRITK

YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 

GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL 

S DE FRLRDS S S P WQCP VCRS I KKKNTNKQEMGT YLR FIVSRMKE 

RAI DLNKKG KDNKHPM YRRLVHS AVDVPT IQEKVNEGKYRS YEE 

FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC 

KNCFYLANARPDNWFCYPCIPNHELDMAKMKGFGFWPAKVMQKE 

DNQ VDVRF FGHHHQRAW I PS ENI QD I TVN I HRLHVKRSMG WKKA 

CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 

3EPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 

3VSTQTKKLSASS PR^HRSTQTTNDGVCQSMCHDKYTKI FNDF 



5967 






102 






5968 



81 



1925 



1288 



5969 



1126



5970



533 



4712 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
KDRHKSDHKR5TKRV VREALE KLR ijKMEEBKRQAVWKAVANMQG 
BMDRKCKQVKEKCKEEFVEEIIOCLATnHirnT.TQrY,.vvv^ v «^ 



— ▼ ivanuL ^UKtofcWKlSiSiCi<QAvNKAVANMQG 

EMDRKCKQVKE KCKEEFVEE I KKLATQH KQL ISQTKKKQWCYNC 
BEEAMYHCCWNTS YCS I KCQQEHWHAEH KRT CRRKR 

RSKQVMARLTKKRQADTKAIQHLWAAIEIIRNQKQIANIDRITK" 
YMS R VHGMH PKETTRQLS LAVKDGLI VETLfVGCKGS KAGI EQE 
G YWLPGDE I D WE TENHDW YC FE CHL PG EVL I CDLCFR VYHS KCL 
SDEFRLRDSSS PWQCPVCRSI KKKNTNKQEMGTYLRFI VSRMKE 
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEQADIARWLYKDTCHEL\DELQLC 
KNCFYU^PD^FCYPCIPI^ELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGRFVJKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKXLSASS PRMLHRSTQTTNDGVCQSMCHDKYTKI FNDF 
KDRNKSDHKRETERWREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKOIjISQTKKKQWCYNC 
EEEAM YHCCWNTS YCS I KCQQ EHWH AEHKRTCRR KR 
VKFPRRGGAPPT VLTPGRQQGVFLG PQRP 



*■ * ^»^wcvr lAjfVKFUSEPDIPARGQPHPP 

rpvgvstsaqaqvqppamhrrrlalglgfcliactslsvlwvyl 

BNWLPVSYVPYYLPCPEIFNMKLHYKREKPIKJPWWSQYPQPKL 

LEHRPTQLLTLTPWLAPIVSEGTFNPELIiQHIYQPLNLTIGVTV 

FAVGN/HFLESAEEFFKRGYRVHYYIFTDNPAAVPGVPLGPHRL 

LSSI P 3 QGHSHWEETSMRRMETISQHI AKRAKREVDYLFCLDVD 

MVFRNPWGPETLGDLVAA1HPSYYAVPRQQFPYERRRVSTAFVA 

DSEGDFYYGGAVFGGQVARVYEFTRGCHMAIIADKANGIf^AAWR 

EESHLNRHFrSNKPSKVLSPEYLWDDRKPQPPSLKLIRFSTLDK 
DISCLRS 



U VU FN I KRKRCJUliD VF LES PR KPSGRRDRAPEKQRRI AANKCLC 
TGVREGEPPS/TTSQKVKEAGRDFTYLIWLFGISITGGLFYTl 
FKELFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKDGIiKHTCVKFYIEGSEPGKQGTVYAQVKENP 
GSGE YDFRYIFVEIESYPRRTI I I EDNRS QDD 

sqdnighrljoukhgwklgwixsi^Lqgrtdpipivvkydvmgmg 

RMEMKLDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 
KALEDLRANFYCELCDKQYQKHQEFDNHINSYDHAHKQRLKDLK 
QREFARNVSSRSRKDEKKQEKALRRLHEIiAEQRKQAECAPGSGP 
MFKPTTVAVDEEGGEDDKDESATNSGTGATASCGLGSEFSTDKG 
GPFTAVQITWTTGLAQAPGLASQGISFGI KNNLGTPLQKLGVS F 
S FAKXAP VKLES I AS VFKDHAE EGTS EDGTKPDE KSS DQG LQK V 

GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEGAGATE 
PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 
KKGS S PK P KS C I KAAAS QGAEKT VS E VS EQP KETSMTE PS EPGS 

KAEAKKALGGDVSDQSLESHSQKVSETQMCESNSSKETSLATPA 
GKESQEGPKHPTGPFFP VLSKDES TALQ WPS ELL I FTKAEPS IS 

YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 
PNKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 
TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 
PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 
KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 
KSPSQYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 
DASSDQSCYSRQRSYSDDSYSDYSDRSRRHSKRSHDSDDSDYAS 
SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 
SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 
KRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 
DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPBDKNSVTAKLLL 
EKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 
LGNKPVLPLIGKLPATRKPNKKCEESGLERGEBQEQSETEEGPO 
GSSDALFGHQFP\SEETTGPLLDPPPEESKSGBVTADHPVAPLG 
PPAHFDCYLGDPTISHNYLPDPSDGNTLESLDSSSOPGPVESSL 
LPIAPDLEHFPS YA P PSGDPS I ESTDGAEDA\SIjAPLESQP I TF 
SEQ
ID
NO: 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



5971 



53 



5972 



440 



5973 



65 



Predicted end 
nucleotide 
location 
corresponding
to first
amino acid
residue of
amino acid
sequence 



2149 



1761 



2007



5974



4293



2200 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



TPEEMBKYSKLQQAAQQHI QQQIxLAKQVKAFPA5AALAPATPAD 
Q P I HIQQ PATAS ATS I TT VQHAILQHHAAAAAAAI G IHPHPHPQ 
PLAQVHHI PQ PHLTP I S LSHLTHS I 1 PGHPATFLASHPIHI I PA 
S AI HPG P FTFHP VPHAALY PTL LAPR P AAAAATALHLH PLLH P I 
FSGQDLQHPPSHGT 

S FL YFVG VuMurgp i GNWDGRFDG VQLCS FA CVEST I LIiHIND II 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RSHLFYTLNGSSVDSQPQSKSKNTWYIDEVAEDPAKSLTEISTD 
FDR3SPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAQS VMEELNTAP VQES P PIiAMPPGNSHGLEVGSLAE VKENP 
PFYGVIRWIGQPPGLNSVLAGLELEDECAG\CTDGTF/REGTRY 
FTCALKKALFVKLKSCRPDSRFASLQPVSNQIERCNSLAIWBAY 
LSE\A^ENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTLF 
CLFAFSSVLDTVLLRPKEKNDVEYYSETQELLRTEIVNPLRIYG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLNILFHHILRV 
BPLLK1RSAGQKVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN 
SNLKFAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 
PRQCRICX5GLAMYECRECYDDPDTSAGKIKQFCKTCNTQVHLHP 
KRLNHKYNPVSLPKDLPDWDWRHGCIPCQNMELFAVLCIETSHY 
VAFVKYGKDDSAWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYL 
KMSLEDLHSLDSRRIQGCARRLIiCDAIYVPCTQ SPTMSLYK 
ILLAGSPSPRDQCSQRQSSGGDKEIjV TRGCTFSTAWSPSAMTQ ' 
EPFREELiAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SHKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI 
PSAIVSFTVSRRNANVIPNFQILFVSTFAVTTTCLIWFGCKLVL 
NPSAININFNLILLLLLELLMAATVIIaARSSEEDCKKKKGSMS 
DSANILDEVPFPARVLKSYSWEVIAGISAVLGGIIALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 
TSPLLFTASGYLSFSIMRIVEMFKDYpPAIKPSYDVLIiLLLLLV 
LLLQA/GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 
GQB P PEG VRQGE S LES RRGANG PVTP RRGNR VAA PS LAPGMETH 
NP 

NGDGKDLFGHIWAWRSNGIISNFRRS PHAGMAEDEPDAKSPKTG 

GRAPPGGAEAGEPTTLLQRLRGTISKAVQNKVEGILQDVQKFSD ' 

MDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 

HTDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKIIREIFPDI 

KARRLGGRGQSKYCYSGIRRKTXiVSMPPLPGLDLKGSESPEMGP 

EVTPAPRDELVEAACALTCDWAERILKRSFSSIVEVARFLLQQH 

bISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 

KKPERLAQPPK0LEARTGAGPLARGBRKKSVVESSAPGANNLQV 

NALVARLPLLLPRAPRSLIPPIPVSPPILAPRLSSGALKVATLP 

LS S RAG A P PAAVP I INM I LP TV PALPG PG PGPGRA PPGGLTQ PR 

GTENREVGIGGDQGPHDKGVKRTAEVPVSBASGQAPPAKAAKQD 
1EDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRL 
P WETWG SGGEGNS AGG AER PG PMGEAE KG AVLAQG \ QGDGT VS K 

GGRGPGSQHTKEAEDKIPLVPSKVSVIKGSRSQKEAFPLAKGEV 
DTAPQGNKDLKEHVLQS5LSQEHKDP KATPP 

LGLQMHTTSGRXHQAMVTS'LNEDNESVT VEWIEN^DTKGK\Blb 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RD1TRVVGSARARPSQFPEQFSSAQQNGSV\S 
D I SP VQAAKKE FG P PS RR KSNC VKEVE KLQE KRE KRRLQQQELR 
EKRAQDVDATN PNYE I MCM IRDFRGSLDYRPLTTADP I DEHR IC 
VCVRKRPLNKKETQMKDLDVITI PSKDVVMVHEPKQKVDLTRYL 
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG 
QTGSGKTHTMGGDFSGKNQDCSKGI YALAARDVFLMLKKPNYKK 
LELQ V YAT F F E I YSG KVFDLLNR KT KLR VLEDGKQQVQ VVGLQE 
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK 
ECIRAIiGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD 
LBTQWGVGSS PQRDDLKLLCEONEEEVSPQLFTFHEAVSQMVEM 








EEQ WEDHRAV FQES I RWLEDEKALLEMTEEVDYD VDS YATQLE 
AILEQKIDILTELRDKVKSFRAAIXJEEEQASKQINPKRPRAL 


5975 


4293 


2200


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID " 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
7V\ASIKNDPPS\RDNRWGSARARPSQFPEQPSSAQQNGSV\S 
D I S P VQAAKKEFGP PS RRKSNCVKEVBKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYEI MCM I RD FRGSLD YRPLTTADP I DEHRI C 
VCVRKRPLNKKETQMKDLDVITI PSKDWMVHEPKQKVDLTRYL 
ENQTFR FDYA FDDS APNEMVYR FTARPLVETI FERGMATCFAYG 
QTG5GKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK 
LELQVYATFFEIYSGKVFni.T.NRKTKLRVLEDGKQQVQWGLQE 
RE VKCVEDVLKL I D IGNS CRTSGQTS AN AHS S RSKA V FQ 1 1 LRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRL2GAEINKSLLALK 
EC I RALGRNKPHTPFRASKLTQVLR DS FIGENSRTCM IATIS PG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD 
LETQWGVGSS PQRDDLKLLCEQNEEEVS PQLFTFHEAVS QMVEM 
EEQWEDHRAVFQES I RWLEDEXALLEMTEEVDYDVDS YATQLE 
AI LEQKI D ILTELRDKVKSFRAALQEEEQAS KQINPKR PRAL 


5976 


20 


2949 


vhhlhltrvsvwnldiilriaqqmgiktlnlvlg\lkra\lef 

P E VS WME V KD PNMKGAMLTNTGKYAI PT I DA\ EA YAIG KKE KP P 
FLPEEPSSSSEEDDPIPDELLCLICKDIMTDAWIPCCGNSYCD 
BCIRTALLESDEHTCPTCHQNDVSPDALIANKFLRQAVNNFKNE 

tgytkrlrkqlpsppppippprpliqrnlqplmrspisrqqdpl 
mipvtsssthpapsissltsnqsslappvsgnpssapapvpdit 

ATVSISVHSEKSDGPFRDSDNKILPAAALASEHSKGTSSIAITA 

lmeekgyqvpvlgtpsllgqsllhgqlipttgpvrintarpggg 
rpgwehsnklgylvsppqqirrgerscyrsinrgrhhsersqrt 
cgpslpatpvfvpvpppplypppphtlplppgvpppqfspqfpp 
gqp \p pagys v? ppgfp papanlstpwvssgvqtahsnti PTTQ 
applsreefyreqrrlkeeekkkskldeftndfakelmeykkiq 

KERRRSFSRSKSPYSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 
RflHSRSYSRSPPYPRRGRGKSRNYRSRSRSHGYHRSRSRSPPYR 

ryhsrsrspqafrgqspnkrnvpqgetereyfnryrevpppydm 
kayygrsvdfrdpfekeryrewerkyrewyekyyxgyaagaqpr 
psanrenfsperft.plnirnspftrgrredyvggqshrsrnigs 
nypeklsardghnqkdntkskekesenapgdgkgnichkkhrkrr 

KGEBSEGFLNPELLETSRKSREPTGVEENKTDSLFVLPSRDDAT 
PVRDEPMDAESITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
SKKBNIVKPAKGPQEKVDG\DVRDLLDLNL\QLKKPKEETPKDL 

tilnhhlplrrmkksl\epp\ekltlnqqk\tprnktsqrgkse 

EGLFQRCQIRKANN 


5977 


1363 


1336


FLEDRGQVLSHFQCLSLHSINHILHPGAGVAAGPAtGW/REYL^ 
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAIIEEDDGDGGtW 
DTYHNTG ITG I TEAVKE ITLENKDNIRLQDCSALCE EEEDEDEG 
EAADMEEYBESGLLETDEATLDTRKIVEACKAKTDAGGEJDAILQ 
TRT YDL Y I TYDKYYQTPRLW L FG YDEQRQ PLTVEHM YEDI S QDH 
VKKT VT I ENH PHLP P PPMCS VHPCRHAE VMKK 1 1 ET VAEGGG E L 
GVHMYLLI FLKFVQAVIPTIEYDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ 

RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEOHAOCPHCRAPT.OT.RRT.VHPPijappvTv^rir t\tt s\ 

LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGMH 
GGHTFKPLAEIYBQHVTKVNEEVAKLRRRLMELISLVQ2VERNV 
EAVRNAKDE R VRE I RNAVEMMIARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQEVEHQLRSCSKSELISKSSEIt^lMFQQVHRKPM 
AS FVTTPVP PDFTS ELVPS YDS ATFVLENFSTLRQRADP VYSPP 
LQVSGLCWRLfCVYPDGWGWRGYYLSVFLELSAGLPETSKYEYR 
VEMVHQS CNDPTKN I IREFASDFEVGECWGYKRFFRLDLLANBG 
YLNPQNDTVlIiRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ 
INNLKERLTIELSRTQKSRDLSPPDNHIiS PQNDDALETRAKKSA 








CSDMLLER \GP YSAS \VREAKEDEEDEEKIQNEDYHKELSDGDL 
DLDLVYEDEVNQLDGSSSSASSTATSNTEENDIDEETMSGENDV 

•* """^oouCilJl^UAAft/Uji'AuiSnGYVGS 
ATSSLLDIDPliILIHLLDLKDRSSlENLWGLQPRPPASLLQPTA 
S YS R KDKDQR KQQAMWR VPS DLKMLKRLKTQMAE VRCMKTDVKN 
TLSEIKSSSAASGDMQ7SLFSADQAALAACGTENSGRLQDLGME 
LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 
DRQCKALDS DA WVAV F SG LPAVE KRRKMVTLG ANAKGGHLEGL 
QMTDLENNSETGELQPVLPEGASAAP3EGMSSDSDIECDTENEE 

CEEHTSVGGFHDSFMVMTQPPDEDTHSSFPDGEOIGPEDLSFNT 
DBNSGR 


5979 


212 


3665


LPDMTMYbWLKLLAFGFAFLDTEVFVTGQSPTPSPTDAYLNA^E 

TTTLSPSGSAVISTTTIATTPSKPTCDEKYANITVDYLYNKETK 

LFTAKLNVNENVEOGNNTCTNNEVHNLTECKNAS VS I SHNSCTA 

PDKTLILDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 

DTQNITYRFQCGNMIFDNKEIKLENLEPEHEYKCDSEILYNSHK 

FTNAS KI I KTDFGS PGE PQ 1 1 FCRS EAAHQG VI TWN PPQ R5 FHN 

FTLCYI KETBKDCLNLDKNbl KYDbQNLKP YTKYVLSLHAY I T A 

KVQRNGSAAWCH FTTKSAP PSQVWNMTVSMTSDNSMHVKCR PPR 

DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 

AYF1WGDYPGEPFILHHSTSYNSKALIAFLAFLIIVTSIALLVV 

LYKI YDLHKKRS CNLDEQQELVERDDEKQLMNVEP IHADI LLET 

YKRKI ADEGRLFLAE FQS I PRVFS KFP I KEARKP FNQNKNR YVD 

ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 

DETVDDFWRM2WEQKATVIVMVTRCEEGNRNKCAEYWPSMEEGT 

RAFGECCCKDLTKHKRCP\DYIIQKLNIVNKKEKATGREVTHIQ 

FTS WPDHG VPE D PH LLLKLRR RVNAFSN FFSGP I WHCS AG VGR 

TGTYIGIDAMLEGLEAENKVDVYGYWKLRRQRCLMVQVEAQYI 

LIHQALVEYNQFGETEVNLSELHPYLHNMKKRDPPSEPSPLEAE 

FQRLPSYRSWRTQHIGNQE\ENKSKNRNSNVIPYDYNRVPLKHE 

LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 

AAQGPLKETIGDFWQMI FQRKVKVI VMLTELKHGDQE I CAQYWG 

EGKQTYGDI E VDLKDTDKS S TYTLRVFE LRHSKR KDS RTVYQ YQ 

YTNWSVEQLPAEPKELISMIQWKQKLPQKNSSEGNKHHKSTPL 

L IHCRDGSQQTG I FCALLN LLE S AETE E WDI FQ WKALRKAR P 

GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 

KVKQDANCVNPLGAPEKLPEAKEQAEGS E PTSGTEGPEHS VNG P 
ASPALNQGS 


5980 


3 r 


2363 


DAWGCKLRRLRFTYGTQTRVSLALPGQYELVHTLVAHQGNWETI 

P EEDLE VQENNED AAHDLTELE VTMHHALLQ E VD VWAP OQGLR 

PTVDVLGDLVNDFLP VITYALHKDELS ERDEQELOE I RKYFS FP 

VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN 

CGAPGQDTKAQSMLVEQSEKLRHliSTFSHQVLQTRLVDAAKALN 

LVHCHCLDIFINQAFDNX3RDLQITPKRLEYTRKKENELYESLMW 

IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 

VGTREIKCCIRQIQELIISRLNQAVANKLISSVDYLRESFVGTL 

aK^i^toijai^yuvbVHIlbNYLKQIIjNAAYHVEVTFHSGSSVT 

MLWEQIKQIIQRITWVSPPAITLEWKRKVAQEAIESLSASKLAK 

SICSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 

DHAPRLARLSLES RS LQDVLLHRK P KLGQ E LGHGQ YG WYLCDN 

WGGHFPCALKSVVPPDEKHWNDLiALEFHYMRSLPKHERLVDLKG 

S VIDYNYGGGSS IAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 

DWEGIRFLHSQGLVHRDIKLKNVLLDKQNRAKITDLGFCKPEA 

MMSGSIVGTPIHMAPELFTOKYDNSVDVYAFGILFWYICSGSVK 

LPEAFERCASKDHLV7NNVRRGARPERLPVFDEECWQLMEACWDG 

DPLKRPLLGIVQPMLQGIMNRLCKS\NSEQPNRGLDDST 


5981 


1 j 


2519


GHKHSAAMEKPWGAADGLSRWPHGLGLLtLLQLLPPSTLSQDRL 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
3\EDEECGRVRDFVAKLANNTHQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 
SEQ
ID 
NO: 



5982 



5983



5984 



5985 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



~5T 



2316 



T7?T 



755 



1193 



22 



1408 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
T E FGMAI G P EKSG KWLTAE VSGGS RGGR 1 FRS S D FAKNF VQTD 
I LP FH P LTQMMY S PQNS DYLLALS TENG LWVS KN FGGKWEE I H KA 
VCrAKWGSDNTIFFTTYANGSCKADLGALBLWRTSDLGKSFfCTI 
GVKIYSFGLGGRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 
SVGQEQFYS ILAANDD.WFMHVDEPGDTGFGTI FTSDDRGI VYS 
KS LDRHL YTTTGG E TDFTNVTS LRG VY I TS VLS EDNSIQTMITF 
DQGGRWTH LRK PENS ECDATAKNKNECS LH IHAS YS I SQ KLNV P 
MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EGPHYYTILDSGGIIVAIEHSSRPINVIKFSTDEGQCWQTYTFT 
RDPIYFTGLASEPGARSMNISIWGFTBSFLTSQWVSYTIDFKDI 
LERNCEEKDYTIWLAHSTDPEDYEDGCILGYKEQFLRLRKSSVC 
QNGRD Y WTKQPS I CLCS LE D F LCDFG YYR PSNDS KCVEQPEL K 
GHDLEFCLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLSPEKQNSKSNSVPI ILAIVGLMLVTVVAGVLIVXKYVC 

GGRFLVHLYSVLQQH\AEA\NGVDGVDALDTASHTNKSGYHDDS 
DEDLLE 

atrpprgsswcrqfsrta saapgrsnmlripvrkalvglskspk" 

GCVRTTATAASNLIEVFVDGQSVMVEPGTTVLQACEKVGMQIPR 
FCYHERLSVAGNCRMCLVEIEKAPKWAACAMPVMKGWNILTNS 

ekskkaregvmefllanhpldcpicdqggecdlqdqsmmfgndr 

SRFIiEGKRAVEDKNIGPLVKTIMTRCIQCTRCIRFASEIAGVDD 

lgttgrgndmqvgtyiekmfmselsgniidicpvgaltskpyaf 
tarpwetrktesidvmdavgsniwstrtgevmrilprmhedin 
eewisdktrfaydglkrqrltepmvrnekglltytswedalsrv 
agmlqs fqgkdvaaiagglvdaealvalkdllnrvdsdtlcteb 
vfptagagtdlrsnyllnttiagveeadwllvgtwprfeaplf 
narirkswlhndlkvaligspvdltytydhlgdspkilqdiasg 
shpfsqvlkeaxkpmwlgssalqrndgaailaavssiaqkirm 
tsgvtgdwkvmnilhriasqvaaldlgykpgveairknppkvlf 
llgadggcitrqdlpkdcfiiyqghhgdvgapiadvilpgaayt 
eksatyvntegraqqtkvavtppglaredwkiiralseiagmtl 
pydtl\dqvrnrleevspnlvryddieg\anyfqqanelsklvn 

(^QLLADPLVPPQLTMKDFYMTDaiSRASQTMAKCVKAVTEGAQA 

eargdggrrrhrasgrragrgep\ag lksqgqravpkravargg 
rq\ysaaiallepagseiaddlsilysnraacylkegncsgciq 
dcnral2lhpfsmkpllrramayetleqygkayvdyktvlqidc 
glqlandsvnrlsrilmeldgpnvjreklslipavpasvplqawh 
pakemiskqagdssshrqqgitdektfxalkeegnqcvndknyk 
dalskyseclkinnkecaiytnralcylklcqfeeakqdcdqal 
qladgnvkafyrralahkglknyqkslidlnkvilldpsiieak 
meleevtrllnlkdktapfnkekerrkieiqevnegkbepgrpa 
gevstgclasekggkssrspedpeklpiakpnnayefgqiinal 
strkdkeacahllaitapkdlpmflsnklegdtfllliqslknn 
liekdpslvyqhllylskaerfkmmltliskgqkelieqlfedl 
sdtpnnhftlediqalkrq yel 

ssvcmactyvsnlgkkqrsvsflasglmrvstcpelrlhhsfvl" 

I tgdvgrricrllvglftkgdtsskrvhpfspgpcfllcdlarvg 

sspkinvspfyqn\qtstqrsctvfvwqrcslvgpfqvtvftmy 
fhhslrsisrfssg 

rrvarpgtakpakarrtVrrgrarrdlagaerkagvsergdsgr ' 

RRPNPS I PSAAAGMSHIQI PPGLTELLQGYTVEVLRQQP PDLVE 

faveyftrlrearapasvlpaatprqslghpppepgpdrvadak 
gdseseededlevpvpsrfnrrvsvcaetynpdeeeedtdprvi 
hpktdeqrcrlqeackdillficwldqeqlsqvldamferivkad 
ehvidogddgdnfyviergtydilvtkdnqtrsvgqydnrgsfg 

E LALM YNTP RAAT I VATS EG S LWGLDR VT FRR 1 1 VKNN AXKRKM 
, FESFIESVPLLKSLEVSERMKIVDVIGBKIYKR/DGERIITQGE 
I K\ADSFYII2SGEVSILIRSRTKSNKDGGMQEVEIARCHKGQYF 

GE LALVTN KPRAASA Y AVGDVKC LVMDVQA FERLLG PCMD IMKR 
i NISHYEEQLVKMFGSSVDLGNLGQ 


5986


1806


484 


DAWKSTS LTPHWKLWGRHRGRRRGbAHPKNHLSPOQGGATPQVp- 
fa r<-LK* US PRGPPPPRIjGLLGALMAEDGVRGS ppvpsgppmeed 
GLRWTPKS P LDPDSGLIiS CTLPNGFGGQSG PEGER S LAP PDAS I 
LISNVCSIGDHVAQEliFQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVQS I LDEFLQT\ YGSLI PLSTDEWEKLED I FQQEPST P 
S R KG LVLQL I QS YQRMPGNAMVRG FR VAYKRH VLTMDDLGTL YG 
QNWLNDQ VMNM YG DL VMDT VP EK \ VHF FNS FF Y \D KLRTKG YDG 
VKRWTKNVDlFNKBLLLIPIHLEVHWSLISVDVRRRTirYFDSO 
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN 

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGIiAHPKNHLSPOQGGATPQVP "' 

SPCCRFDSPRGPPPPRUSLLGAltMAEDGVRGSPPVPSGPPMEED 

GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGBRSLAPPDASI 

LISNVCSIGDHVAQELFQGSDLGMAESAERPGEK\AGQHSPLRE 

EHVTCVQS1LDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP 

SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 

QNV/LNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKIiRTKGYDG 

VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ 

RTLNRRCPKHIAKYI^AEAVKKDRLDFHQGWKGYFKMNVARQNN 

DSDCG AFVLQ YCKHLALS Q P FS FTQQDMPKLRRQ I YKEIi CHCKL 

TV , 


5988 


1292 


410 


FKKYFLS FLGLLES SHSRDR I HNLVLMFLLATHNLVWWFTCRFQ 
RLDCIYLNAGIMPNPQLNIKALLFG1>FS\AEGLLTQGDK1TADG 
LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSLEDFQHSKGKEPYSSSKYATDLLSVALNRNFMQQGLYSNVAC 
PGTALTNLTYG ILPPFI WTLLMPA I LLLRFFANAFTLTPYNGTE 
ALVWLFHQKPESLNPLIKYLSATTGFGRNYIMTQKMDLDEDTAE 
KFYQKLLELEKH I RVTIQKTDNQARLSGS CL 


5989


194 


2G10 


AMDFPQHSQHVLEQLNQQRQLGIiLCDCTFWDGVHFKAHKAVLA 
ACSEYFKMLFVDQKDVVHLDISNAAGLGQVLEFMYTAKLSLSPE 
NVDDVt, \ AV7ATFLQMQD 1 1 TACHAL KS LAE PATS PGGNAEALAT 
EGGDKRA K EEKVATSTLS RLEQAGRST PIG PSRDL KE ERGGQ AQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAAEAEAALSESS 
E QEME VE PARKGE EEQKEQEEQEEEGAG PAEVKEEGSQLENGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 
KCEDCGKEFTHTGNFKRHIRIHTGEKPFSCRECSKAFSDPAACK 
AHEKTHSPLKPYGCEECGKSYRLISLLNLRKKRHSGEARYRCED 
CG KJjFTTS GNLKRHQLVHSGEK P YQCD YCGRS FS DPTS KMR HLE 
THDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKC2ECGKQ 
FTTSGNLKRHLRIHSGEKPYVCIHCQRQFADPGALQRHVRIHTG 
EKPCQCVMCGKAFTQASSIiIAHVRQHTGEKPYVCERCGKRFVQS 
SQIiANH I RHH DN I RPH KCS VCS KAFVNVGDLS KH 1 1 1 HTGE KP Y 

LCDKCGRGFNRVDNLRSHVKTVHQGKAGIKILEPEEGSEVSWT 
VDDMVTLATEALAATAVTQLTVVPVGAAVTADETEVTjKAEISKA 
VKQVQEEDPNTHILYACDSCGDKFLDANSLAQHVRIHTAQALVM 
FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


fgpgpdsgggargsgwgsrsqapygtlgavsggeqvllheeac5d~ 

SGFVSLSRLGPSLRDKDLEMEELMLQDETLLGTMQSYMDASLIS 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQ3PPPQQRSDGEEEEEVASFSGQILAGELDNCVSSIPDFP 
MHLACPEEEDKATAAEMAVPAAGDES I SSLSELVRAMH P YCLPN 
LTHLASLEDE LQEQPDDL'ITjPEGCWLEI VGQAATAGDDLE I P V 

WROVS PGPR pvllddsletssalqllmptleseteaavpkvtl 
csekeglslnseekldsacllkprewepwpkepqnppanaap 
gsqrarkgrkkkskeqpaacvegyarrlrsssrgqstvgtevts 
qvdnlqkqpqeelqkesgplqgkgkprawarawaaalensspkn 
lersagqss pakegpi/dly pkladtiqtnpi pthlslvds aqas 
pm pvds veadptavgp vlagpvp vdpglvdlas ts selveplpa 
epvlinp vlads aavdpawpisdnlp pvdavpsg papvdlalv 








DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQLLVESES "" 

LDPPKTIIPEVKEWDSLKIESGTSATTHBARPRPLSLSEYRRR 

RCXJRQAETEERS PQPPTGKWPSLPETPTGIiADI PCLVI PPAPAK 

KTALQRS PETPLEI CLVP VGPS PAS PSPE PPVS KPVASS PTEQV 

PSQEMPLLARPS PPVQS VS PAVPTPPSMSAALP FPAGGLGMPPS 

LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 

CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 

GPLGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 

GPPENVLPIiSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 

KKVSALVOSPQMKALACVSAEGVTVEEPASERliKPETQETRPRB 

KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 

DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 

GN SGG VD I PQE KRPLDR LQ APELANVAGLTP P AT P PHQLWKPLA 

AVSLLAKAIC c !PK < 3TAnPnTT.lfDPnVT'7awirD7Mk - irDT ocnirunnc 
rxvo xm^Ci\3 1 UIMrCU V L i/Aivi t rAA V H 1 H^l r.C ■» ViiCj PS 

RVHVG3GDHDYC\VRSRTPPKK\MPALLIPEVGSRWNVKRHQDI 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCLAPS 
SLLSPEASPCRKDMNTRTPPEPSAKQRSMRCYRKACRSASPSSQ 
GWQGR^GRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PSPRRRSDRRRRYSSYRSHDHYQRQRVIiQKERAIEERRWFIGK 
IPGRMTRSELKQRFSVFGEIEECTIHFRVQGDNYGFVTYRYAEE 
nr nni ci ouniuinynuuy r r UUL. f wXKUi' ^wvaioUljDSNKEDr 
DPAPVKSKFDSLDFDTLLKQAQKNIaRR 


5991 


334 


1379 


RLSSHFSQCSPSIYCXTKFDKQGNVTSFERKKTELYQELGIiQAR 
DLRFQHVMS ITVRNNRI IMRMEYLKAVITPECLLILDYRNUNLK 
QW LFR 3LPSQLSG EGQL VTYPLP FE FRAI E ALLQ YWINT LOG Kt» 

STTjOPTjTLRTr.nZVT/^nDlf JTCG\/nDCVT UTT T nnnirnr nrt nmnr 
" mwruj.ua 2 uvf\uK3Utri\n.aZ* ViJKoJNljMlJjJjyNC»K£»IjSEIjETDI 

. KI FKESILEILDEEELLEBLCVSKWSDPQVFEKSSAGIDHAEEM 

ELLLENYYRLADDLSNAARELRVLIDDSQS 1 1 F INLDSHRNVMM 

RLNLQLTMGT FSLSLFGLMG VAFGMNLES S LE E DH R I FWL I TG I 

MFMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGGIVEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINIj/SEVLDASSLSFNTRLKWFAICFVC 
GVFFS ILGTGLLWLPGGI KLFAVFYTLGNLAALASTCFLMGPVK 
QLKKM FE ATRLLATI VMLL CF I FTLCAALW WH KKGLAVL F C I LQ 
FLSMTWYSLS YI PYARDAVI KCCSSLLS 


5993 


1650 


594 


AEGIX3SWAVWAGLGWAGRHMEAGGATGALGVGCKLPSAFCFPGS 
SVAMDMFQKVE KIGEGTYGWYKAKNRETGQLVAIiKKI RLDLEM 
EGVPSTAIREISLLKELKHPNIVRLLDWHNBRKLYLVFEFLSQ 
DLKKYMDSTPGSELPLHL I KS YTjFQLLQGVSFCHSHRVIHRDIiK 
PQNLL INELGA I KLAO FG LARAFG VP LRTYTH E WTLW YRAPE I 
L.LATRFYTTAVDIWSIGCIFAEMVTRKALFPGDS\EIDQ\LFRI 

frmlgtpsedtwpgvtqlpdykgsfpkwtrkgleeivpnlepeg 

RDLLMQLLQYDPSQRITAKTAIAHPYFSSPEPSPAARQYVLQRF 
RH 


5994


394 


1934 


AGEVQLHVWIRGMRIQPG/KAAAIIDLDPDFEPQSRPRSCTWPL 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
PQPG I LG AVTG PRKGGS RRNAWGNOS YAEL IS OA TF^APFTTPT *p 

LAQIYEWMVRTVPYFKDKGDSNSSAGWKNSIRHNLSLHSKFIKV 
HNBATGKSSWWMLNPEGGKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRLSPLRPESEVLAE3IPASVSSYAGGVPPT 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPIiHTYSS 
SLFSPAEGPLSAGEGCFSSSQALEALLTSDrrpppPADVLMTQVD 
PILSQAPTLLLLGGLPSSSKLATGVGLCPKPLBAPGPSSLVPTL 
SMIAPPPVMASAPIPKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MENL ECDMDNI ISDLMDEGEGLDFNFEPDP 


5995 


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRLPGR 
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEE LHSL\ D P\ RRQELLE ARF \ TGLG VS KG PIiNS ESSNQS L 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 








I S D YFBRR VEQPLYG LDGSAAKE ATEE QS ALPTLMS VMLAKPRL 
DTEQLAQRGAGLC FT FVS AQQNS PS STG SGNTEHS CS SQKQ I S I 
QHRQT\QSDLTIEKISALENSKNSDLEKKEGRIDDLLRANCDLR 
RQI\DEQQKMLEKYK\ERLNRCFDNEPRNFLIEKSKQEKMACRD 
KSMQDRIiRLGHFlTVRKGASFTEQWTDGYAFQNLIKQQSRINSQ 
REEIERQRKMLAKRKPPAMGQAPPATNEQKQRKSKTNGAENBTL 
TLAE YHEQE E I FKLR LG H LKKE EAE I QAE LERLER VRNLHIR EL 
KRIHNEDNSQFKDHPTLNDRYLLLHLLGRGGFSEVYKAFDLTEQ 
RYVAVKIHQLNKNWRDEKKENYHKHACRBYRIHKELDHPRIVKL 
YDY FSLDTDS FCTVLE YCEGNDLDFYLKQHKLMSE KEARS I IMQ 
IVNALKYLNEIKPP1 IHYDLKPGNILLVNGTACGEI KITDFGLS 
KIMDDDSYNSVDGMBLTSQGAGTYWYLPPECFWGKEPPXISNK 
VD V WS VG V I FYQCL YG R K P FGHNQS QQD I LQENTI LKATE VQFP 
PKPWTPEAKAFIRRCLAYRKBDR I DVQQLACD PYLL PH I RKS V 
STSSPAGAAIASTSGASNNSSSN 


5996 
5997 


1612


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFSIWFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPES
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP




1612 


981


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFSIWFGSIVNEGYLNSASEGEEPCIYNRNPNACSYGVAVGVL
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP


5998 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE
LFSIWFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA
FWAFLWFTGDSCYL\ANQWQVSKPKDNPLNEGTDASPGRPSPFS
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLPP\ASAQP


5999 


2 


1790


RPPMEKARRGGDGVPRGPVLHIVWGFHHKKGCQVEFSYPPLIP 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFG I SC YR\Q IEAKALKVRQAD I TRETVQKSVCVLS KLPL YG 
LLQAKLQL1THAYFEEKDFSQISILKELYEHMNSSLGGASLEGS 
QVYLGLSPRDLVLHFRHKGLI LFKLI LLEKKVLFYIS PVNKLVG 
ALMTVLSLPPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSASTADVSHTNLGTIRKVMAGNHGEDAAMKTEEPLFQVEDSS 
KQQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLG SDQTNLFP KDS V PS BS LP ITVQPQANTGQ WLI PG L I SGL E 
EDQYGMPLAI FTJCG YLCLP YMALQQHHLLSDVTVRG FVAGATN I 
LFRQQKHLSDAIVEVEEALIQIHDPELRKLLNPTTADLRFADYL 
VRHVTENRDDVFLDGTG WEGGDEW IRAQFAVY IHALLAATLQLV 
LFR I VNVAKKI GNVMVTT\ SRNVVQTGK\AVGQS VGGAFS \ SAK 
TA\MSSWLSTFTTSTSQSLTEPPDEKP 


6000


101 


1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 
DD FE DKVFYRTE FQNRE FKATM CNLLAYLKHLKGQNE AALE CLR 
KABELIQ^EHADQASIRSLVTWGNYAWVYYHMGRLSDVQIYVDK 
VKHVCEKFSSPYRI3SPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLHKMREEGEEEGEGEK\LVEEALEKAPG\VTDV 
LRSAA\ KFYRGKDEPDKAIELLKKALE Y r P \NNAYLHCQIGCCY 
RAKVFQ VMNLR BNGM YG KRKLLE LIGHAVAHLKKADEANDNL FR 
VCSILASLHALADQYEDAE YYFQKEFS KELTPVAKQLLHLRYGN 
FQLYQMKCEDKAIHHFIEGVKINQKSREKEKMKDKLQKIAKMRIi 
S KNGADSEALHVLAFLQELNE KMQQADEDS ERGLESGSLI PSAS 
SWNGE 


6001 


176 


1038 


AFAHSPSRGHKKTHIHTPRHTPRCTMAESHLQSSLITASQFFE1 
WLHFDADGSGYLEGKELQNLIQELQQARKKAGLELSPEMKTFVD 
QYGQRDDGK1GIVELAHVLPTEENFLLLFRCQQLKSCE\EFMKT 
WRKYDTDHSGFIETEELKNFLKDLLEKANKTVDDTKLAEYTDLM 






WO 01/53312 



PCT/US00/34263 











liiUj t l>s w WDG KLELTEMAR LL P VQEN FLLKFQG I KMCGKE FNKA 

FELYDQDGNGYIDENELDALLKDLCEKNKQDLDINNITTYKKNI 
MALSDGGKLYRTDLALILCAGDN 


6002 


977 


81 


LAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLrigHS - " 

SMAQRSDLtELDCQLTRDRWWSHDENLCRQSGLNRDVGSLDF 

EDL PLYKE K LEVY FS PGHFAKGSDRRMVRLEDLFQR F P RTPMS V 

EIKGKNEELIREQ/VLVRRYDRNEITIWASEKSSVMKXCKAANP 

EHPLSPTISRGFWVLLSYYLGLLPFIPIPEKFFFCFLPNIINRT 

YFPFSCSCLNQLLAWSXWLIMRKSLIRHLEERGVQWFWCLNE 

ESDFEAAFSVGATGVITDYPTALRHYLDNHGPAARTS 


6003 


140


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP - 

A P KTSGNP ANS ARKPGS AGG P KVG AG AS KEGG AGAVDE DDF I KA 

WD VPS IQ I YS SRELEE TLNK IRE ILSDDKHD WDQRANALKK I R 

SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 

KLSTVLGNKFDHGAEAI VPTLFNLVPNS AKVMATSGCAAI RFI I 

RHTH VP R LI PL ITSNC7S KS VP VRRRS FE FLDLLLQE WQTHSLE 

RHAAVLVETIKKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 

NSLEPSYQKSLQTyLKSSGSVASLPQSDRSSSSSQESLNRPFSS 

KWSTANPSTVAGR VSAGSS KASSLPGSLQRSRSDIDVNAAAGAK 

AHHAAGQS VRS GRLG AGALNAG S YASLEDTS D KLDG TAS EDGRV 

RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 

TTALS TVSS G VQR VL VNS AS AQKR S KI PRSQG CS REAS PSRLS V 

ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 

TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVLNTGSDVEEA 

VADALLLGD IRTKKKPAR R RYES YGMHSDDDANS DASS ACSERS 

YSSRNGSIPTYMRQT\EDV\AEVLNRCASSNWSERKEGLLGLQN 

LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 

QVHKDDLQDWLFVLLTQLLKKMC5ADLLGSVOJUCVQKALDVTRES 

FPNDLQFNI LMR FT VDQTQT P SL KVKVAIL KY I ETLAKQMDPGD 

F I NS S ETRLAVSRVITWTTE P KSS DVR KAAQS VL IS LFELNT PE 

FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 

RS PAN WS S PLTS P TNTSQNTL S PS AFDYDTENMNS ED I YS S L^G 

VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 

GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 

PFNKS AL KE AMFDDDADQ FPDDLSLDHSDLVAE LL K E r iS NHNER 

VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 

IRALALKVLREILRHQPARFKNYAELTVMKTLEAHKDPHKEVVR 

SAEEAASV\LATSI\SPEQCIKVLCPIIQTADYPINLAAIKMQT 

KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 

HAVIGDELKPHLSQLTGSKMIOjLNLYIKRAQTGSGGADPTTDVS 
GQS 


6004 


110 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 
APKTSGNPANSARKPGSAGGPKVGAGASKEGGAGAVDEDDFIKA
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA
HLSTVI^NKFDHGAEAIVPTLFNLVPNSAKVMATSGCAAIRFI I 
RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLLLQEWQTHSLE
RHAAVLVETIKKGIHDADAEARVEARKTYMGLRNHFPGEAETLY
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSHSDIDVNAAAGAK
AHHAAGQSVRSGRLGAGALNAGSYASLEDTSDKLDGTASEDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRLSV
ARSSRIPRPSVSQGCSRSASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVLNTGSDVEEA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS
YSSRNGSIPTYMRQT\EDV\A£VLNRCASSNWSERKBGLLGLQN 
LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDPI 
3 VHKDDLQD WLF VLLTQLLKKMGADLLGS VQAKVQKALD VTRE S 
FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD
PINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine,
K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








*TMi,LGALPKTFQDGATKLLHNHLRNYGNGTQSSMGSPLTRPTP 
RS P ANWS S P LTS PTNTSQNTLS PS APT) YDTENMNS ED I Y SSLRG 
VTEAIQNFS?RSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFNKSALKEAMFDDDADQFPDDI^LDHSDLVAELLKELSNHKBR 
VEERKXALrELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
I RALALKVLRE I LR HQ PAR FKNYAELT VMKT1E AHKD PH KEWR 
SAEEAASV\LATSI\SPEQCIKVLCPIIQTADYPINLAAIKHQT
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS
GQS 


6005 


133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDALL

NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPFCKPRDPKIPKS 

KRQKKERMLLCRQLGDSSGEGPEFVEEEEEVALRSDSEGSDYTP 

GKKKKKKLGPKKEKKSKSKRKEEEEEDDDDDDDSKEPKSSAQLL 

EDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPLIAAKNPKIAVS 

KMMMVLGA KWR3 FS TNNP FKG SSG AS VAAAAAAAVA WES MVT A 

TEVAPPPPPVEVPIRKAKTKEGKGPNARRKPKGSPRVPDAKKPK 

PKKVAPLKIKLGGFGSKRKRSSSEDDDLDVESDFDDASINSYSV

SDGSTSRSSRSRKKLRTTKKKKKGEEEVTAVDGYETDHQDYCEV 

CQQGGEIILCDTCPRAYHMVCIiDPDMEKAPEGKWSCPHCEKEGI 

OWE AKE DNS EGEE I LE E VGGDLEEEDDHHME FCR VCKDGGE LLC . 

CDTCPSSYHIHCLNPPLPEIPNGEWLCPRCTCPALKGKVQKILI 

WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFFVKWQGMS 

YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRILNHSVDKKG 

HVHYLIKWRDLPYDQASWESEDVBIQDYDLFKQSYWNHRELMRG 

EEGRPGKKLKKVKLRKLERPPETPTVDPTVKYERQPEYLDATGG 

TLHPYQMEGLNWLRFSWAQGTDTILADEMGLGKTVQTAVFLYSL 

YKEGHSKGPFLVSAPLSTIIN\WEREFEWWAPDMYV\VTYVGDK 

DSRAI IREN'EFS \FEDNAIRGGKKASRMKKEASVKFHVLLTSYE 

LITIDMAILGSIDWACLIVDEAHRLKNNQSKFFRVLNGYSLQHK

LLLTGTPLQNNLEELFKLLNFLTPERFHNLEGFLEEFADIAXED

QIKKLHDMLG\PHMLRRLKADVFKNMPSKTELIV\RVELSPM\Q

K3CYYK\YILHSKFLKALN\AJIGGGNQVSLLNVVMDLKKCCNHPY 

LFPVAAMEAPKMPNGMYDGSALIRASGKLLLLQKMLKNLKEGGH

RVLIFSQMTKMLDLLEDFLEHEGYKYERIDGGITGNMRQEAIDR

FNAPGAQQFCFLLSTRAGGLGINLATADTVIIYDSDWNPHNDIQ

AFSRAHRIGQNKKVMIYRFVTRASVEERITQVAKKKMMLTHLW

RPGLGSKTGSMSKQELDDILKFGTEELFKDEATDGGGDNKEGED 

SSVIHYDDKAIERLLDRNQDETEDTELQGMNBYLSSFKVAQYW 

REEEMGEEEEVEREIIKQEESVDPDYWEKLLRHHYEQQQEDLAR 

NLGKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE

DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR

QRKAFLNAIMRYGMPPQDAFTTQWLVRDLRGKSEKEFKAYVSI.F 

MRHLCEPGADGAETFADGVPREGLSRQHVLTRIGVMSLIRKKVQ 

EFEHVNGRWSMPELAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 

NTPAPVPPAEDGIKIEENSLKEEESIEGEKEVKSTAPETAIECT 

QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 

AADVEKVEEKSAIDLTPIWEDKEEKKEEEEKKEVMLQNGBTPK 

DLNDEKQKKNIKQRFMFNIADGGFTELHSLWQNEERAATVTKKT 

*c<A»*ni\itnijfXiniijUrt\jxXINnvj X i\£\n*JU l\jr*Di?K I AIliNiPFKGEM 

NRGNFLEIKNKFLARRFKLLEQALVIEEQLRRAAYLNMSEDPSH
PSMALNTRFAEVECLAESHQHLSKESMAGNKPANAVLHKVLKQL
EELLSDMKADVTRLPATIARIPPVAVRLQMSERNILSRLANRAP
EPTPQQVAQQQ 


6006


1 


965 


DNDFLRNTVHRHEPPVTAEPIRLLAENEDVVWDKPSSIPVHPC
GRFRHNTVI FILGKEHQLKBLH PLHRLDRLTSGVLMFAKTAAVS 
ERIHEQVRDRQLEKEYVCRVEGEFPTEEVTCKEPILWSYKVGV
CRVDPRGKPCETVFQRLSYNGQSSWRCRPLTGRTHQIRVHLQF 
LGHPILNDPIYNSVAWGPSRGRGGYIPKTNEELLRDLVAEHQAK 
QSLDVLDLCEGDLSPGLTDSTAPSSELGKDDLEELAAAA\QKMB 
EVAEAAFQELDTTALASEKAVETDVMNQ\RQT\TLCRVPAGATG
SLAPRPCDVPTCPTL


6007 


3 


2351 


HELGQVEYVFTDKTGTLTENEMQFRECSINGMKYQEINGRLVPE
GPTPDSSEGNLSYLSSLSIILNNLSHLTTSSSFRTSPENETELIK 
EHDLFFKAVSLCHTVQINNVQTDCTGDGPWQSNLAPSOIiEYYAS 
S PDE KALVBAAARIGI VFIGNSEETMEVKTLGKLER Y KLLH I LE 
FDSDRRRMSVIVQAPSGEKLLFAKGABSSILPKCIGGEIEKTRI 
HVDEFALKGLRTLCIAYRKFTSKEYEEIDKRZFEARTALQQR\E 
E KLAAVFQF I E KDLI LLG ATAVEDRLQD KVR ET I EALRMAG 1 KV 
WVLTGDKHETAVSVSLSCGHFHRTMNrT.PT.TMnK'QnQCT'a.K'Ar d 
QLARRITEDHVIQHGLWDGTSLSLALREHEKLFMEVCRNCSAV
LCCRMAPLQKAKVIRLl KIS PEKPITLAVGDGANDVSM1QEAHV 
G I G I MG KEGRQ AARNSD YA I AR FKFLS KLI^FVHGH FY YI R I ATL 
VQYFFYKNVCFITPQFLYQFYCLFSQQTLYDSVYLTLYXNICFT 
SLPILIYSLLEQHVDPHVLQNKPTLYRDISKNRLLSIKTFLYWT
ILGFS HAFI FFFGS YULIGKDTSLLGNGQMFGNWTFGTLVFTVM 
VIT VTV XMALETH FWT WlNHLVTVfG SIIFYFVFSLFYGGILWPF 
LGSQNM YFVF IQLLSSGS AW FA 1 1 LM WTCL FLDI I KK VFDRHL 
HPTSTEKAQLTETNAGIKCLDSMCCFPEGEAACASVGRMLERVI
GRCSPTHISRSWSASDPFYTNDRSILTLSTMDSSTC 


6008


4554 


1089 


AGVRRAGARRGPGRALFAGATAVPPPSARRRRRCPAPEHAGPAR

ASRPSQETMFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIE 

DFKOFEPNDFYLKNTTWEDVGLMDPS LTKNQD YRTKP FCCSACP 

FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTFNADKKTLETH

IKIFHAPNASAPSSSLSTFKDKNKNDGLKPKQADSVEQAVYYCK 

KCTYRDPLYEIVRKHIYREHFQHVAAPYIAKAGEKSIjNGAVPLG 

SNAREESSIHCKRCIiFMPKSYEALVQHVIEDHERIGYQVTAMIG 

HTNVWPRSKPLMLIAPKPQDKKSMGLPPRIGSLASGNV\RSLP 

SQQMVNRLSIPKPNLNSTGVNMMSSVHLQQNNYGVKSVGCGYSV

GQSMRLGLGGNAPVSIPQQSQSVKQLLPSGNGRSYGLGSEQRSQ

APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 

ATGPPPGNTSSTQKWKICTICNELFPEMVYSVHFEKEHKAEKVP 

AVANYIMKIHNFTSKCLYCNRYLPTDTLLNHMLIHGLSCPYCRS

TFNDVEKMAAHMRMVHIDEEMGPKTDSTLSFDLTLQQGSHTNIH 

LLVITYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKS

SPQAAVPYKXDVGKTLCPLCFSILKGPISDALAHHLRERHQVIQ

TVHPVEKKLTYKCIHCLGVYTSNMTASTITLHLVHCRGVnKTnM 

GQDKTNAPSRLNQSPSLAPVKRTYEQMEFPLLKKRKLDDDSDSP 

SFFEEKPEEPWLALDPKGH\EDDSYEARKSFLTKYFT\KOPYP 

TRREIEKLAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL

LG FNMKE LNKVKH EMDFOAEG LFENHDE KDS R VNAS KTAD KKLN 

LGKEDDSSSDSFENLEE2SNESGSPFDPVFEVEPKISNDNPEEH 

VLKVIPEDASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS

EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDE 

SSQSEDARSSKPAAKKKATMQGDREQLKWKWSSYGKVEGFWSKD 

QSQWKNASENDERLSNPQIEWQNSTIDSEDGEQFDNMTDGVAEP 

MHGSLAGVKLSSQQA


6009 


4272 


1534 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSIC+GSPSC ' 
HLVLGVLVPVARQSSHSAGPAQSAPR*TGTGSGTPKAAEQSGYW
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEQASQRRTVFTAGGGECLGAKSVRASVFTGNQPGVI4GLL 
NGKRGGCFESGYLFGFIVIGKIQSLEAJCVPLPVNGQTGERASPG 
NCRIHIVDAVC*SEHH*DHFLAAAFLENSTIIS*VAPGSWQDHA 
VLQKEVQASVRCRGFESVDTAPAGFWAHSPPGLQGEPTTTSVSL 
F VLAPQDG EG V p FVEGQL VTVLGLWPQS I RHTFVHHTQL FLH P 
I*KLGALDVAFLHLLTLVCSSFNVAYG*GKNGGTTLHQLFAEVN
AVTRGSAVQRRPSITISSIHVDTKIQQELHDVMVAGADGWQWG
DPFWGLAGIFHLIDDPLHQIELSFQRRV*EQCQGVKPDSQPVP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
ROLLRGGDRGHVWIVLCRLGSLVGGLGTDBLLWFGGR * LI I IG 
I**RGRLSGEWGCGLGRGELFQVSIGIGVSIVHIGQGDHEVLGG
AGLVERGALHATGQGVEALVQQLLDVGPAGALGLCDGAALFQGP
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWOALHGAAI 
CGVGGAILLKALSQYFLKGG*RLWCARGQ*PVKXRQRRWRG*TR 
R*NGLTIHCFN*LI*GAVCCRLVILRWCGLLBVHGVYGT*IHCL 
GSFPGRLWP*PFISQERPNGHCQWEFRLAVPSWKCRWSRWRVRG
TWRYGNPLLNLL*GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LPPFQGACRPRTQRCRTWVCPIAWRQLLAYTRD*


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM
AGISQNAKTGDLPAFGECVGIASKALCGLTEAAAQAAYLVGIFD
PNSQAGHQGLVDPIQFARANQAIQMACQNLVDPGSSPSQVLSAA
TIVAKHTSALCNACRIASSKTANPVAKRHFVQSAKEVANSTANL
VKTI KALDGDFS EDNRNKCR1 ATAPL I EAVEtf LTAFASNPEFVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSLAINPKDP
PTWSVLAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC
IRDIEQASIAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARGEAAQLGHKGTQLAS YFE PLI tiAAVGVASKILDHQQQM 
TVLDQTKTLAESALQP4LYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDD IMVT T.NEAASEVGLVGGMVDAI AEAMS KLDEGTPPE P KG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPBELGGIiASQMTSD 
YGHIAFQGQMAAATAEPEEIGFQIRTRVQDLGHGCIFLVQKAG\
ALQVCPTDSYTKRELIECARAVTEKVSLVLSALQAGNKGTQACI
TAATAVSGIIADLDTTIMFATAGTLNAENSETFADHRENILKTA
KALVEDTKLLVSGAASTPDKLAQAAQSSAATITQLAEWKLGAA
SLGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM
YQLKGAAKVMVTNVTSLLKTVKAVEDEATKGTRALEATIECIKQ
ELTVFQSKDVPEKTSSPEESIRMTKGITMATAKAVAAGNSCRQE
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC
TI/3YLDL LEHVLVI ZiQ KPT PELKQQLAAFS KRVAGAVTEL I QAA 
EAMKGTEWVDPBDPTVIAETELLGAAAS I EAAAKKIiEQLKPRAK 
PKQADETLDFEEQILEAAKSIAAATSALVKSASAAQRELVAQGK
VGSIPANAADDGQWSQGLISAARMVAAATSSLCEAANASVQGHA
SEEKLISSAKQVAASTAQLLVACKVKADQDSEAMRRLQAAGNAV
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGGIAQIIAAQEEM
LKKERELEEARKKLAQ.TRQQQYKFLPTELREDBG 


6011 


446 


1635 


LLQPAMRKSPGI*SDCLWAWILLLSTLTGRSYGQPSLQDELKDNT 
TVFTR I LDRLLDGYDNRXiRPGLGERVTEVKTDI FVTS FGPVSDH 
DMEYTI DVFFRQS WKDERLKFKGPMTVLRIiNNLMAS KI WTPDTF 
FHNGKKS VAHNM TM PNKLLR I T3IX3TLLYTMRLT VR \ AEC PMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEWYEWTREPARSVWAED 
GSRLNQYDLLGQTVDSG1VQSSTGEYVVMTTHFHLKRKIGYFVI 
QTYLPCIMTVILSQVSFWLNRESVPARTVFGVTTVLTMTTLSIS
ARNSLPKVAYATAMDWF1AVCYAFVFSALIEFATVNYFTKRGYA 
WDGKSWPEKPKKVKDPLIKKNNTYAPTATSYTPNLARGDPGLA
TIAKSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLF 
GIFNLVYWATYLNREPQLKAPTPHQ 


6012 


351 


5013 


PAELFQSFAIWHKELYDWRLGPWNQCOPVISKSLEKPLECIKGE " 
EGIQVREIACIQKDKDIPAEDIICEYFEPKPLLEQACLIPCQQD
CIVSEFSAWSECSKTCGSGLQHRTRHWAPPQFGGSGCPNLTEF
QVCQSSPCEAEELRYSLHVGPWSTCSMPHSRQVRQARRRGKNKE
REKDRSKGVKDPEARELIKKKR1TRNRQNRQENKYWDIQIGYQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
PCSKTCHDMVSPAGTRVRTRTIRQFPIGSEKECPEFEEKEPCLS
QGDGWPCATYGWRTTEWTECRVDPLLSQQDKRRGNQTALCGGG 
IQTREVYC^OANENLLSQLSTHKNKEASKPMDLKLCTGPIPNTT 
QLCHIPCPTECEVSPWSAWGPCTYENCKDQQGKKGFKLRKRRIT
NE PTGGSGVTGNCPHLL2AI pce E PACY dw KAVR LGDCE PDNG k 
ECGPGTQVQEWCINSDGEEVDRQLCRDAIFPIPVACDAPCPKD
CVLSTWSTWSSCSHTCSGKTTEGKQIRARSILAYAGEEGGIRCP
NS SALQE VRS CNEH PCTVYH WQTGPWGQCI EDTS VSSFNTrTTW 
NGEASCSVGMQTRKVICVRVNVGQVGPKKCPESLRPETVRPCLL
pckkdci vtpy5dwtscps\sckegdss i rkqsrhrvi iqlpan 
ggrdctdplyeekaceapqacqsyrwXkthkwVhrcqXlvpXws 
VQQDSP\GAQEGCGPGRQARAITCRKQDGGQAGIHECLQYAGPV

PALTQACQ I P CQDDCQiiTS WS K PS S CNGD CGAVRTRKRTIjVG KS 
KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGDIKECGCGYRYQAMACYDQNGRLVETSRCNSHGYI
EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN
GGRPCPKLDHVNQAQVYEWPCHSDCNQYLWVTEPWSICKVTPV 
NMRE^aSEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPLGSR 
VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 
GRSCPNAVEKEPCaSfLNKNCYHYDYNVTDWSTCQLSEKAVCGNGI 
KTRMLDCVRSDGKSVDLKYCEALGI.EKNWQMNTSCMVECPVNCQ 
LSDWSPWSECSQTCGL7GKMIRRRTVTQPFQGDGRPCPSLMDQS 
KPCPVKPCYRWQYGQWSPCQVQEAQCGEGTRTRNISCWSDGSA 
DDFSKWDBEFCADIBLIIDGNKNMVLEESCSQPCPGDCYLKDW 
SSWSLCQLTCVNGEDLGFGGIQVRSRPVI IQEliENQHLCPEQML 
ETKSCYDGQCYEYKWMASAWKGSSRTVMCQRSDGINVTGGCliVM 
SQPDADRSCNPPCSQPHSYCSETKTCHCEEGYTEVMSSNSTLEQ 
CTLIPVWLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFLQ
PFGPEXSRLKTWVYGVAAGAPVLLIFIVSMIYLACKKPKKPQRRQ
NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVLKVLWLTASQPC 
SIVDVEFLPVYHPSPEESRDPTLYANNVQRVMAQALGIPATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRWLCGRWGR 
ARPESNDQPGRVCQAATAli 


6014 


2857 


613 


EAVAGGMEKSRMNLPKGPDTLCFDKDEFMKEDFDVDHFVSDCKK 
RV0LEELRDDLELYYKLLKTAMVELINKDYADF\VNLSTNLV3M 
DKALNQLSVPLGQLREBVLSLRSSVSEGIRAVDERMSKQEDIRK 
KKMCVLRLIQVIRSVEKIEKILNSQSSKETSALEASSPLLTGQI
LERIATEFNQLQFHACQSK\GMPLLDKVRPRIAGITAMLQQSLE
GLLLEGLQTSDVDIIRHCLRTYATIDKTRDAEALVGQVLVKPYI
DE V 1 1 EQ FVE SHPNG LQ VM YNKLLE FVPHHCRLLREVTGGA I SS 
EKGNTVPGYDFLVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY
TISMDFVRRLERQCGSQASVKRLRAHPAYHSFNKKWNLPVYFQI 
RPREIAGSLEAALTDVLEDAPAESPYCLLASHRTWSSLRRCWSD 
EMFLPLLVHRLWRLHSGRFWARYSVFV\N\ELSLRPISNESPKE 
IKKPLVTGSKEPSITQGNTEDQGSGPSETKPWSISRTQLVYW
ADLDKLQEQLPELLEIIKPKLEMIGFKNFSSISAALEDSQSSFS
ACVPSLSS K 1 1 QDLSDSC FGFLKSALEVPRL Y RRTNKEVPTTAS 
SYVDSALKPLFQLQSGHKDKLKQAIIQQWLEGTLSESTHKYYET 
VSDVLNSVKKMEESLKRLKQARKTTPANPVGPSGGMSDDDKIRL 
QLALDVEYLGEQIQKLGLQASDIKSFSAIABLVAAAKDOATAEQ 
P 


6015


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 
VKGKGS^PLS AHG I WAWLS RAEWDQVTVYLFCDDHKLQRYALN 
RITVtlRSRSGNELPLAVASTADLIRCKIiLDVTGGLGTDELRLLY 
GMALVRFVNLISERKTKFAKVPLKCLAQEVNIPDWIVDLRHELT 
HKKMPHINDCRKGCYFVLDWLQKTYWCRQLENSLRETWELEEFR
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KAIKAWNNPSPRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEOLAALO I E YE ENVDLND VL.VP K P FS O FWnDT.T .DCi r .H CrvN pto 

ALLERMLSELPALGISGlRPrYILRWTVELIVANTKTGRNARRF 
SAGQWEARRGWRLFNCSAS LDWPRM VESCLGS PCWAS PQLLR 1 1 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QESPTAENARLLAQXRGALQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCX\TDTLGL 
\ SCG VGS\ GNCSNSSSSNFRGAFIiLEARGSLH \GL\ KTGLQLF 


6016


13 


2237 


ASGCAERRGTEP WELSMSWESGAGPGLGSQGMDIiVWSAWYGKC " " 
VKGKGSLPLSAHGIWAWLSRAEWDQVTVYLFCDDHKLQRYALN
RITVWRSRSGNELPLAVASTADLIRCKLLDVTCGLGTDELRLLY 
GMALVRFVHLISERKTKFAKVPLKCLAQEVNIPDWIVDLRHELT
HKKMPEINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEPR 
EG I EEEDQEE DKN I WDDI TEQKPE PQDDG KSTES D VKADGDS JC 
GSEEVDSHCKKALSHKELYBRARELLVSYE2EQFTVLEKFRYLP 

n « c o tri\ v t v LtfuSij Ko V l UlNH EAVLDAFLDDGFLVPT 
FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ
ALLERMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDWPRMVESCLGSPCWASPQLLRII 
F\ KAMGQG LQD E\BQEKLLRICSIYTQSGENS LVOEGS EAS P IG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE
BKEVLPDQVEEEEBNDDQEEBEEDEDDEDDEKKDRMEVGPFSTG 
QESPTAENARLLAQKRGALQGSAWQVSSEDVRWDTFP\LGRMPR
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF


6017 


203 




SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPEITYRLRNDS
nfalotmepalpmppveeldvmfselvdeldltdkhreamfalp 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDQLNSMAARKSLLAL
EKEEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLSCILNFLK
TMDYETSESRIHTSLIGCIKALMNNSQGRAHVLAHSESINVIAQ
SLSTENIKTKVAVLEILGAVCLVPGGHKKVLQAMLHYQKYASER
TRFQTLINDLDKSTGRYRDEVSLKTAIMSFINAVLSQGAGVESL
DFRLHLRYE\FLMLGIHPVMDKLRKHENSTLDRHLDFFEMLRNE
DELEFAKRFELVHIDTKSATQKPELTRKRLTHSEAYPHFMSILH
h clqmp ykr sgntvq y wllldr 1 i qqi viqndkgqdpds tpl en 
FNIKKVVRMLVNENEVKQWKEQAEKMRKEHNELQQKLEKKEREC
DAKTQEKEEMMQTLNKMKEKLEKETTEHKQVKQQVAELTAQLHE
LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGM
LPPPPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALKKKSIPQ
PTNALKSFNWSKLPENKLEGTVWTEIDDTKVFKILDLEDLERTF

i» a x y Kyy uff vn SNS KQK EADAI DDTLS S KLKVKELS V I DGRRA 
QNCNILLSRLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFVPE 
KSDIDLLEEHKHELDRMAKADRFLFEMSRINHYQQRLQSLYFKK 
KFAERVAEVKPKVEAIRSGSEEVFRSGALKQLLEVVLAFGNYMN 
KGQRGNAYGFKISSLNKIADTKSSIDKNITLLHYLITIVENKYP 
SVLNLNEELRDIPQAAKVKMTELDKEISTLRSGLKAVETELEYQ
KSQPPQPGDKFVSWSQPITVASFSFSDVEDLLAEAKDLFTKAV 
KHFGEEAGKIQPDEFFGIFDQPLQAVSEAKQENENMRKKKEEEE 
RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKRNRKRITNQMTDSSRERPITKLNF


6018 
6019 


13 
2 


2510 
1066 


TISQSGGIRRRREAVWFEWNMDFSRLHMYSPPQCVPENTGYTY

ALSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRLATTACTLGD

GEAVGADSGTSSAVSLKNRAARTTKQRRSTNKSAFSINHVSRQV

TSSGVSYGGTVSLQDAVTRRPPVLDESWIREQTTVDHFWGLDDD 

GDLKGGNKAAIQGNGDVGAGAATGHNGFFCSNCNMLSERKDVLT 

AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 

ACAGYFLLQILRRIGAVGQAVSRTAWSALWLAWAPGKAASGVF

WWLGIGWYQFVTLISWLNVFLLTRCLRNICKFLVLLIPLFLLLG 

LSLRGQG\NFFSFLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 

PLQGDSEAFPWHWMSGVEQQVASLSGQCHHHGENLRELTTLLQK

LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 

HLEDILGKLREKSEAIQKELEQTKQKTISAVGEQLLPTVEHLQL 

ELDQLKSELSSWRHVTCroCETVDAVQERVDVQVREMVKLLFSED 

QQGG S LBQLLQR FSSQ F VS KGDLQTMLRDLQLQ ILRNVTHHVS V 

TKQLPTSEAWSAVSEAGASGITEAQARAIVNSALKLYSQDKTG

MVDFALBSGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 

WIQPDIYPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHIPKTL 

SPTGNISSAPKDFAVYGTiENEYQKEGQLI*GQFTYDQDGESLQMF 

QALKRPDDTAFQIVELRIFSNWGHPEYTCLYRFRVHGEPVK

TPNDREPPPQRPPSSRRASHIiAQEITSAASLGDQTQIl^SLTTA 
PVITSAIRSMPGISSQILTNAQGQVIGTLPWWNSASVAAPAPA 
QSU3VQAVTPQLLLNAQGQVIATLASSPLPPPVAVRK\PSTPES 
LLKSEVOPIKPTPTVPQPAWIASPAPAAKPSASAPrPITCSBT 
PTVSQLVSKPHTPSLDEDGINLEEIRBFAKNFKIRRLSLGLTQT 
QVGQALTATEGPAYSQSAICRFEKLDITPKSAQKLKPVLEKWLN 
EABLiRNQEGO^NLMEFVGGEPSKKRKRRTSFTPQAIEALNAYFE 
KNPLPTGQEITEIAKELNYDREWRVWFCNRRQTLKNTSKLNVF 
QIP 


6020 


4953


549 


EAIQFEVSXGNYGNKFDTTCKPLASTTQYSRAVFDGWYYYYLPW 

AHTKPVVTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI 

QGKIPANQLAELWLKHDEVIEDTRYTLPI.TKGKANVTVLDTQI 

RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 

EPQNSKPD 1 1 1 WM I RGE KRLAYAR I PAHQVLYSTSGENASG KYC 

GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 

GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 

FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 

GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 

AVDEKGWE YGI T I PPDHKPKS WVAAEKMYHTHRRJRRIiVKKRKKD 

LTQTA3STAGAMEBLQDQEGWEYASLIGWKFHWKQRSSDTFRRR 

RWRRKMAPS ETHGAAAI FFCLEGALGADTTEDGDEKSLEKQFCHSA 

TTVFGANTPIVSCKFDRDYIYHLRCYVYQARNLLALDKDSFSDP 

YAHICPXHRSKTTE11HSTLNPTWDQTIIFDEVEIYGEPQTVLQ 

NPPKVIMBLFDNDQVGKDEFLGRS I FSPWKLNSEMDITPKLLW 

HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP

QG 1 R PVVQLTAIEILAWGLRNMKNFQMAS I TS PS L WECGGERV 

E3 WI KNLKKTPNF PSSVLFMKVFL PKEEL YMP P L VI KV I DHRQ 

FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 

DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV

DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 

DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 

RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG

KKVI E \ DRDHY I PNTLNP VFGRMYEIjS CYLPQEKDLKIS V YD YD 

TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 

RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE

FEANKILHQHIiGAPEERLALHILRTaGLVPEHVETRTlHSTFQP 

NIS\RYYLRVIIW^fTKDVILDEKSITGEEMSDIYVKGWIPGNEE 

NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 

SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 

HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKG^ 

PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 

KtiDLPNRPETSFliWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 

LFVAVLLYSLPNYLSMKIVKPNV 


6021


4953 


549 


EAIQFEVSlGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYIiPW 
AHTKPWTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI
QGKI PAJNQLAELWLKLIDEVI EDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDIUWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GXTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE
GTFTVFAEMYEKQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGV/EYGITI PPDHKPKS WVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGAKEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFI3EVEIYGEPQTVLQ 
NPPKVIMELFDNDQVGKDEFLGRS I FSPWKliNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELILRGKIX3SNLPILPPQRAPNLYMVP 
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV
ESWIKNLKKTPNFPSSVLFMKVFLPKEBLYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
DIVIEMBDTKPLIJ^KCbSSMSTALSKMASPATVHLTEKEEEl^ - 

DWWSKFYASSGEHEKCX5QYIQKGYSKLKIYNCELENVAE7EGLT 

DFSDTFKLYRGKSDENEDPSVVGEFKGSFRIYPLPDDPSVPAPP 

RQFRELPDSVPQECTVRIYIVRGLELQPQD1INGLCDPYIKITLG 

KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD 

TFTRDEKVGETIIDLEKPF\LSRFG\SHCX3\IPEEYCVSGVNTW 

RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE

FEANKIIiHQHUSAPEERLALHILRTQGLVPEHVETRTLHSTFXJP 

n t is\ryylrviiwktkdvildeksitgebmsdiyvkgwipgneb 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI
hflqkspggnc/rgldmipdlkamnplkaktaslfeqksmkgvjw 
pcyaekdgarvmagkvemtleilnekbaderpagkgrdepnmnp 
kldlpnrpbts flwftnpcktkkfi vwrrfkwvi igllflli ll 
LFVAVLLYSLPNYLSMKIVKPKV


6022 


4953 


549 


eaiqfevsignygnkfdttckpjlasttqysravfdgnyyyylpw 
AHTKPVVTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI

RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNSMPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 

GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF
flppkgwbwegewivdpersllteadaghteftdevyqnesryp 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKXD
LTQTASSTAGAMEELQDQEGWEYASLIGMKFHWKQRSSDTFRRR
rwrrkmapsbthgaaaifklegalgadttedgdekslekqkhsa 
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP
yah! cflhrskttei ikstlnptwdqti i fdevei ygepqtvlq 

MPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
reRKPWGQCTI ERLDRFRCDPYAGKEDI VPQLKAS LLSAPPCR 
D I VI EM EDTKPLLASKCLSSMSTALS KMASPATVHLT3KEEE I V 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSD3NEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPD3VPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW 
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW
PCYAEKE)GARVKAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKMVIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6023 


102 


916


SQEIiGMFVELNNLLNTTPDRAEQGKLTLLCDAKTDGSFLVHriFL 
SFYLKANCKVCFVALIQSFSHYSIVGQKLGVSLTMARERGQLVF 
ucuu/ j. v^ov>K \vt UAyKbFWFJjQrLKbANAGNLKPLFEFVREA 
LKPVDSGEARWTYPVLLVDDLSVLLSLGMGAVAVLDFIHYCRAT
VCWELKGNMWLVHDSGDAEDEENDILLNGLSHQSHLILRAEGL
ATGFCRDVHGQLRILWRRPSQPAVHRDQSFTYQYKIQDKSVSFF
AKGMSPAVL 


6024 


3 


3260 


FLSFLCYPRFRCLFCLQFAIPASRMEQLNELELLMEKSFWEEAE 
LPAELFQKKWASFPRTVLSTGMDNRYLVLAVNTVQNKEGNCEK
RLVITASQSLENKELCILRNDWCSVPVEPGDIIHLEGDCTSDTW
1 1 DKD FG YL I L YPDML I SGTS IAS S I RCMRRAVLS ETFRS S DPA 
TRQMLIGTVLHEVFQKAINNSFAPEKLQELAFQTIQEIRHLKEM 
YRbNLSQDEIKQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLSL 
PSDNSKDNSTCNIEWXPMDIEESIWSPRFGLKGKIDVTVGVKI
HRGYKTKYKIMPLELKTGKESNSIEHRSQWLYTLLSQERRADP 
EAGIiLL YIiKTGQM Y P VPANHLDKR ELLKLRNQMAFS L FH R I S KS 
ATRQKTQLASLPQI IEEEKTCKYCSQIGNCALYSRAVEOQMDCS 
SVPIVMLPKIEEETQHLKQTHLEYFSLWCLMLrLESQSKDNKKl'J 
HQNIWLMPASEMEKSGSCIGNLIRMEHVKIVCDGQYLHNFQCKH
GAIP\rmUIAGDRVIVSGEERSLFALSRGYVKEINWTTVTCIJ*D 
RNbS VLP ES TL FRLDQEE KNCD I DT P LGNLS KLMENT F VS KKL R 
DLIIDFRBPQFISYLSSVLPHDAKDTVACIL.KGLNKPQRQAMKK 
VLLSKDYTLIVGMPGTGKTTTICTLVRILYACGFSVLLTSYTHS
AVDNIbLKIAKFKIGFLRSR\QIQKVHPAIQQFTEHEICRSKSI 
KS\LALLEELYTSQLIDATTCMGINHPIFSRKIFDFCIVDEASQ
1SQPICLGPLFFSRRFVLVGDHQQLPPLVLNREARALGMSESLF 
KRL EQNKSA WQliTVQ YRMNSK I MS LSNKLT Y EGKLE CGS DKVA 
NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 
KVPAPEQ VEKGG VSN VTEAKLI VFLTS IFVKAGCS PS DIG 1 1 AP 
YRQQLKI INDIiLARS IGMVEVNTVDKYOD\RDKS I VLVS FVRSN 
KDGTVGELLKDWRRT»NVAITRAKHKLILLGCVPSLNCYPPLEKL 
LNHLNSEKLIIDLPSREHESLCHILGDFQRE


6025 


3977 


89 


GGFPAQSDHLPPVFPLRSDLLITMSTLYVSPHPDAFPSLRALIA 

ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 

GLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWSYADTELIP 

AACGATLPALGLRSSAQDPQAVLGALGRALSPLEEMLRLHTYLA 

GEAPTLADLAAVTALLLPFRYVLDPPARRIWNNVTRWFVTCVRQ

PEFRAVLGEWLYSGARPLSHQPGPEAPALPKTAAQLKKEAKKR 

EKLBKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTP 

PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFKPEYGRPNVS

AANPRGVFMMCIPPPNVTGSLHLGHALTNAIQDSLTRWHRMRGE

TTLWNPGCDHAGIATQVVVEKFCLWREQGLSRHQIiGREAFLQEVW 

KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 

LHEEG 1 1 YRSTRLVNWSCTLNS AI SDI E VDKKELTGRTLLS V?G 

YKEKVEFGVLVSFAYKVQGSDSDEEVWATTRIETMLGDVAVAV

HPKDTRYQHLKGKNVIHPFLSRSLPIVFDEFVDMDFGTGAVKIT

PAHDQNDYEVGQRHGLEAISIMDSRGALINVPPPFLGLPRFEAR

KAVLVALKERGLFRGIEDNPMWPLCNRSKDWEPLLRPQWYVR

CGEMAQAASAAVTRGDLRIIiPERHQRTWHAWMDNIRE\WCMFPG 

KLWWG\HR\IPAYFVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 

XAAKEFGVSPDKISLQQDEDVLDTWFSSGLFPLSILGWPNQSED

LSVFYPGTLLETGHDILFFWVARMVMLGLKLTGRLPFREVYLHA 

I VRDAHG RKMSKS LGNVID P1»D VI YG I SLQGLHNQLLNSNIiDPS 

EVEKAKEGQKADFPAGIPECGTDALRFGbCAYMSQGRDINLDVN 

RILGYRHFCNKLWNATKFALRGLGKGFVPSPTSQPGGHESLVDR 

WIRSRLTEAVRLSNQGFQAYDFPAVTTAQYSFWLYELCDVYLEC 

LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 

RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALEIiAbSITRA 

VRP\LRADYNLHPESGPTCFLEVAD\EATGALASAVSGYVQGPG

QAQVWAVAEPWGLPAP\QGCAVALASDRCSI\HLQLQG\LLDP 

AREbG\KLQ\AKRVEAQ\RQAQ\RLR\ERRA\ASGNPVKVPL\E 

VQEADEAKLQQTEAELRKVDEAIALFQKML 


6026 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAVDKKVDCPRLC

TCEIRPWFTPRSIYMEASTVDCNDLGLLTFPAKI.PANTOTi.r.TO 

TNNIAKIEYSTDFPVNLTGIiDLSQNNLSSVTNINGKKMPQLLSV 

YLEENKLTELPEKCLSELSNLQEbYINHNLLSTISPGAFIGLHN 

LLRLHLNSNRLQM INS KW FDALPNLEILM IGEN P 1 1 R I KDMNFK 

PLINLRSLVIAGINLTEIPDNALVGLENLESISFYDNRLIKVPH

VALQKVVNLKFLDLNKNPINRIRRGDFSNMLHLKELGINNMPEL 

ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESLKL 

NSNAbSALYHGTlESLPNLKEISIHSNPIRCDCVIRWMNMNKTN 

IRFMEPDSLFCVDPPEFQGQNVRQVHFRDMMEICLPLIAPESFP

SNLNVEAGSYVSFHCRATA\EPQPEIYWITPSGQKLLPNT\LTD 
KFYVHSEGTLDINGVTPKEGQLYTCIATNLVGADLiCSVMiKVDG 
SFPQDNNGSLNIKIRDIQANSVLVSWKASSKILKSSVKWTAPVK 
TENSHAAQSARIPSDVKVYNLTHLNPSTBYKIC1D1 PTIYQXNR 
KKCVNVTTKGLHPDQKEYEKNNTTTLMACLGGLLGIIGVICLIS 
CLSPEMNCDGGH S YVRNYLQKPTFALGEL YPPL I NLWEAG KE KS 
TSLKVKATVIGLPTNMS 


6027
6028 


5254 


H XH 0 


GGRRAPG R PGRS I KDE EE ETVFRE WS FS PDPLP VRYYDKDTTK 
PISFYLSSLEELIiAWKPRJJEDGFNVALEPLACROPPLSSQRPRT 
LLCHDMMGGYU3DRFIQGSWQTPYAFYHWQC1DVFVYFSHHTV 
TIPPVGWTNTAHRHGVCVIK3TFITEWNEGGRLCEAFLAGDERSY 
QAVADRLVQIT\RFFRFDGWLINlENSLSliAAVGNMPPFLRYIiT 
TQLHRQVPGGLVLWYDSWQSGQLKWQDELNQHNRVFFDSCDGF 
FTN YNWRE EHLERMLGQAG BRRAD VYVG VD VFARGNWGGRFDT 

DKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSQSG 
DPVAliRNRCPAPAKLCPH 




120 


3432 


NCLLLQAKGt'HGEIEDLQQWLTDTERHLLASKPLGGLPEtAKEQ 

LNVHI^VCAAFEAKEETYKSLMQKGQQMIaARCPKSAETNIDQDI 

NNLKEKWESVETKLNER\KT\KLBEALNLA\MEFHNSL\QDFIN 

WLTQAEQTLNVASRPSLILDTVLFQIDEHKVFANEVNSHREQII 

ELDKTGTHLKYFSQKQDWLIKNLLISVQSRWEKWQRLVERGR 

SLDDARKRAKQFHEAWSKLMEWLBESEKSLDSELEIANDPDKIK 

TQLAQHKEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADDNLKLD 

DMLS3LRDKWDTICGKSVERQNKLEEA\LLFSGQFTDALQALID 

WL YRVEPQLAEDQP VHGD I DL VMNL I DNHKAFQ KE LG KRTSS VQ 

ALKRSARELIEGSRDDSSWVKVQMQELSTRWETVCALSISKQTR 

LEAAI^RQAEEFHSVVHALLEWLAEAEQTIiRFHGVLPDDEDALRT 

LIDQHKEFMKKLEEKRAEI^KATTMGiyrVIAICHPDSITTIKHW 

IT! I RARF E E VLAWAKQHQQRLAS A LAGL IAKQELLEALUAWLQ 

KAETTLTDKDKEVI PQE I EEVKALI AEHQTFMEEMTRKQPDVDK 

VTKTYKRRAADPSSLQSHIPVLDKGRAGRKRFPASSLYPSGSQT 

QIETKNPRVNLLVSKWQQVWLLALERRRKLNDALDRLEELREFA 

NFDFDIWRKKYMRWMNHKKSRVMDFFRRIDKDQDGKITRQEFID 

GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA 

YKPITDADKIEDEVTRQVAKCKCAKRFQVEQIGDNICYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFI>VKNDPCRAKGRT 

NKBLREKFILADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 

SQAAQAASPQVPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGEDSGLITTAA 

ARVRTQFADS KKT PS R PGS RAGS KAGS RAS S RRGS DASD FD I S E 

IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
LDKSSKR 


6029 


1 


3533 


1 


IMPCGSSRLLRGCWTHPNEPVSOLSYFDCIESVMEKfSKVLGESM 
AGISQNAKTGDLPAFGECVGIASKALCGLTEAAAQAAYliVGIFD 
PNSQAGHQGLVDP IQFARANQAIQMACQNLVDPGS SPSQVLSAA 
TIVAKHTSALCNACRIASSKTANPVAKRHFVQSAKEVANSTANL 
VKTI KALDG DFS E DNRNKCR I ATAPL I EAVENLTA FASNPE F VS 
I PAQ I S S EG SQAQEP I LVS AKP NILE S S S YLIRTAR S LAI NPKDP 
PTWSVLAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
IATAARGEAAQLGHKGTQLASYFEPLILAAVGVASKILDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDDI M VTLMEAAS E VGLVGG MVDA I AEAMS KLDEGT P PE PKG 
T FVDYQTT WKYS KAI AVTAOEMMTKS VTNPFFT fifiT .a cdmtc r» 
YGHLAFQGQMAAATAEPEEIGFQIRTRVQDLGHGCIFLVQKAG\ 
AIiQVCPTDSYTKRELI ECARAVTEKVSLVLSALQAGNKGTQACI 
TAATAVSG 1 1 ADLDTT I MFATAG TLNABNS ET FADH RE NT LKTA 
KALVEDTKLLVSGAASTPDKLAQAAQSSAATITQLAEVVKLiGAA 
9LGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM 

5f qlkgaakvmvtnvts llktvkave DEATRGTRALBAT I eci kq 

SLTVFQS KDVP EKTS S PEES I RMTKG I TMATAKAVAAGNS CRQB 
^VIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRAIiRFGTEC 








TLGYLDLLkHVLVI LQKPTPELKQQLAAFS KRVAGAVTELIQAA 
EAMKGTEWVDPEDPTVI AETE LLGAAAS I EAAAKKLEQLKPRAK 
P KQADETLDFEEQI LEAAK5 I AAATS AL VKSASAAQR ELVAQG K 
VGSIPANAADDGQWSQGLISAARMVAAATSSLCEAANASVQGHA 
S EE KI*IS SAXQVAAS TAQ LLVAC KVKADQDS E AMRRLQAAGNAV 
KRASDNLVRAAQKAAFGXADDDDVWKTKFVGGIAQIIAAQEEM 
LKK2RELE EARKKLAQ I RQQQ YKFLPTELREDEG 


6030 


3 


1777 


FPGRGSPALQLEVLI CLGLMGLERALNVLAPI FYRNI VNLLTEN 
APHNSLAWTVTSYVFLKFljQGGGTGSTGFVSNLRTFLWIRVQQF 
TSRRVELLIFSHLHELSLRWHLGRRTGEVLRIADRGTSSVTGLL 
SYLVFNVIPTLADIIIGIIYFSMFFNAWFGLIVFtOISLYLTLT 
IVVTEWRTKFRRAMOTQENATRARAVDSLLNFETVKYYNAESYE 
VERYREAI I KYQGLEWKSSASLVLLNQTQNLVIGLGLLAGSLLC 
AYFVTEQKLQVGDYVLFGTYIIQLYMPLNWFGTYYRMIQTNFID 
MENMFDLLKK\2TEVKDLPGAGP FRFQKGRI EFENVHFS YADGR 
BTLQD VSFTVMPGQTLALVGPSGAGKSTI LRLLFRFYDI SSGCI 
RIDGQDISQVTQALFRFSHWELCPKDTVLFNDTIADNIRYGRVT 
AGNDS VEAAAQAAG IHDAIMAF PEGYRTQVGERGLKLSGGEKQR 
VAIARTILKAPGIILLDEATSALDTSNERAIQASLAKVCANRTT 
I WAHRLST WNADQILVI KDGC I VERGRHBALLS RGGVY ADMW 
QLQQGQEETSEDTKPQTMER 


6031 


160 


1694 


LRMS ENLDKS NVNEAG KS KS NDS E EGLEDAVEG ADE ALQ KA I KS 
DSSS PQRVQRPHSSPPRFVTVEEIABTARGVTNMALAHEI WNG 
DFQIKPVELPENSLKKRVKEIVHKAFWDCLSVQLSEDPPAYDHA 
IKLVGEI KETLLS FLLPGHTRLRNQITEVLDLDLI KQEAENGAL 
DISKLAEFIIGMMGTLCAPARDEEVKKLKDIKEIVPLFREIFSV 
LDLMKVDMANFAISS IRPHLMQQS VEYERKKFQEI LERQPNSLD 
FVTQWLEEASEDLMTQKYKHALPVGGMAAGSGDMPRLSPVAVQN 
YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQLTILGAV 
LLVTFSMAAPGISSQADFAEKLKMIVKILLTDMHLPSFHLKDVL 
TTIGEKVCLEVSSCLSLCGSS PFTTDKETVLKGQIQAVAS PDDP 
IRR 2 MES RILTFLETYLASGHQKPLPTVPGGLS PVQR ELBE VAI 
KFARLVNYNKM VFCP YYDAI LS KILVRS 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS
YGLNIEMHKQAEIVKRLNAICAQVIPFLSQEHQQQWQAVERAK
QVTMAELNAIIGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAIP
PIGSSAGLLALSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS
SVSPSASFRGAEKHRNSADYSSESKKQKTEEKEIAARYDSDGEK
SDDNLWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG
mngelts pgaayaglhni spqmsaaaaaaaaaaaygrs pwgfd 
PHHHMRVPAIPPNLTGIPGGKPAYSFHVSADGQMQPVPFPPDAL
IGPGIPRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS

W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 

KJ\X VIAV1I 


6033 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS
YGLNIEMHKQAEIVKRLNAICAQVIPFLSQEHQQQWQAVERAK
QVTMAELNAIIGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAIP
PIGSSAGLLALSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS
SVSPSASFRGAEKHRNSADYSSESKKQKTEEKEIAARYDSDGEK
SDDNLWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG








MNGBLTSPGAAYAGLHNI S PQMS AAAAAAAAAAAYGRS PWGFD 
PHHHMRVPAIPPNLTGIPGGKPAYSFHVSADGQMQPVPFPPDAL
IGPGIPRHARQIKTLNHGEWCAVTISNPTRHVYTGGKGCVKVW
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCPSCCSDGNI
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS
W\DLREGRQLQQHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\
RATVYEVIY 


6034 


2683 


714 


E SGRR RR LKKRRS P C PGTAGG PG ETN PG PG AC PRGP REEAAAAM 
E I APQEAPP VPGADGD I EEAPAEAGS PS PASPPADGRLKAAAKR 
VTFPSDEDIVSGAVEPKDPWRHAQNVTVDEVrGAYKQACQKLNC 
RQIPKLLRQLQEFTDLGHRLDCLDLKGEKLDYKTCEALEEVFKR 
LQFKWDLECJTNLDEDGASALFDMIEYYESATHLNISFNKHIGT 
RGWQAAAHMMRKTSCLQYL\DARNTPLLDHSAPFVARALRIRSS 
LAVLhXENASLSGRPLMLLATALKMNKNIiRELYL\ADNKLNGLQ 
DSAQLGNLLKFNCSLQILDLRNNHVLDSGLAYICEGLKEQRKGL 
VTL\VLWNNQLTHTGMAFLGKTl»PHTQSLETLNtiGHNPIGNEGV 
RHLKNGLISNRSVLRLGLASTKLTCEGAVAVAEFIAESPRiLRL 
DLR ENE I KTGGliMALS LALKVNH S LL RLDLDR E PKKEAVKS F I E 

TQKALLAEIQNGCKRNLVLAREREEKEQPPQLSASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEEEEGEREET 
PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFALALPPEP 
PPGPEVKGGSCGLEHELSCSKNEKELEELLLEASQESGQETL 


6035 


19 


404 


SVTYLGI I LHKNTGALPADPVQLI SQTPTP&TKQQLLS FLGM VG 
YF YLWI PG FAILTKPIjCKLTKENLADAI DP KSFS HS S FRS LKTA 
LENASTLALPDSSQPF\SLHTABVQGCWEILTQGLGPLPV 


6036 


1745 


3*6 - 


LPDVEKLGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRN 
SRGGQGRGVEKPPHLAALILARGGSKGIPLiailKHLAOVPLlGW 
VLRAALDSG AFQS VW VSTDHDE I BNVAKQ FGAQVHRRS S E VS KD 
SSTSLDAIIBFLNYHNEVDIVGNIQATSPCLHPTDLQKVAEMIR 
EBGYDSVFSWRRHQFRWSEIQKGVREVTBPLNLNPAKRPRRQD 
WDGEL Y ENGS FYFAKRH L I EMG YLQGG KMA YYEMRAEHS VD I DV 
D I DWP IAEQRVLR YG YFGKEKLKE I KLLVCN1DGCLTNGH I YVS 
GDQKEIISYDVKDAIGISLLKKSGIEVRLISERACSKQTLSSLK 
LDCKME VS VS DK LAWDE WR KEMG LC WKEVAYLGNEVSDE ECLK 
RVG LS GAPADACSTAQKAVG Y I CKCNGGRGA\ I RE FAEH I C \ LL 
MEKGLINFMPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTSWWMSSVLTILLFSLQGNKMLNYSAPSAGGYLLPRKPVGTPA 
GGGFP RRHSVTLPSS KFRQNQLLS S L KGE PAPAL S S RDS RFRDR 
SFSEGGERLLPTQKQPGGGQVNSSRYKT\ELCRPFEENGACKYG 
DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPYGPRCHFI 
HNAE E RRALAGARDLS ADR PR LQHS FS FAGFPS AAATAAATGLL 
DSPTSITPPPILSADDLLGSPTLPDGTNNPF\AFSSQELASLFA 
PSMOLPGGGSPTTFLFRPMSESPHMFDSPPSPQDSLSDQEGYLS 
SSS3SHSGSDS PTLDNSRRLP I FSRLS I SDD 


6038 


1450 


426 


SSALQEFGTRNHTFGVPLPHRRKQIISCNICQLRF^SDSQAAAH 
YKGTKHAKKLKALEAMKNKQKSVTAKDSAFCTTFTSITTNTINTS 
SDODGTAGTPAISTTTTVEIRKSSVMTTEITSKVEKSPTTATG 
NSSCPSTETEEBKAKRLL\YCSLCKVAVNSASQLEAHNSGTKHK 
TMLE ARNGSGTI KAFPRAGVKGKGPVNKGNTGLQNKTFHCE I CD 
VHVNSETQLKQHlSSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFSKEPSKPLAPRILPNPLAAAAAAAAVAVSSPFSLRTAP 
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY 


6039 


4073


1000 


LDEYEARLTLAKLDDFEEDNEDDDENRVNQEEKAAKITELINKL 
NFLDEAEKDLAT VNSNPFDDPDAAELNP FGD PDSEEP ITETAS P 
RKTEDS FYNNS YNPFKE VQTPQYLNPFDEPEAFVTI KDSPPQST 
KKKNIRPVDMSKYLYADSSKTEEEBLDESNPFYEPKSTPPPNNL 
VNPVQELETERRVKRKAPAPPVLSPKTGVLNENTVSAGKDLSTS 


6040 






PKPSPIPSPVLGRKPNASQSLbVWCKEVTKNYRGVKITNFTTSW 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 
SRLLEPSDMVLIMPDKLrvm-YLYQIRAHFSGQELNVVQIEEN 
SSKSTYKVGNYETDTNSSVDQEKPYAELSDLKREPELQQPISGA 
VDFLSQDDSVFVNDSGVGESESEHQTPDDHLSPSTASPYCRRTK 
SDTEPQKSQQSSGRTSGSDDPGlCSNTDSTQAQVIjLGKKRLtiKA 
ETLELSDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LENS RS LE CRS DPES P I KKTS LS PTS KLG YS YS RDLDLAKKKHA 

SLRQTESDPDADRTTLNHADHSSKIVQHRLbSRQEELKERARVL 
IiEQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQLIAEARSGGKMSELPSYGERAAEKLKERSKASGDENDNIEI 
DTNEEIPEGFWGGGDELTNLENDLDTPEQNSKLVDLKLKKLLB 
VQPQVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTTERF 
RKPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 
EEKAAITETQRKPSEDEVLNKGFKDS\SQYWGELAALENEQKQ 
I DTRAALVE KRLRYLMDTGRNTE EE EAMMQE W FMLVNK KNAL I R 
RKNQLSLLEKEHDLERRYELUJRELRAMLAIEDWQKTEAQKRRE 

QLLLDE^VALVNKRDALVHDr J3AQEXQAEEEDEHLERTLEONKG 
| KMAKKEEKCVLQ 


6041 


475 


1052


PTALMTAPS CAFP VQFRQPS VSG LS Q I T KSLY I SNGVAANNKLM 
LSSNQ X TMVINVS VEWNTLYED I QYMQVPVADS PNS RL CD FFD 
PIADHIHSVEMKQGR\TI,LHCAAGVSRSAALCLAYLMKYHAMSL 
LDAKTWTKSCRPI IRPNSGFWEQLIHYEFQLFGKNTVHM VSSPV 
GMIPDIYEKEVRLMIPL 


6042 


2 


3886


TEKDEKTAHNbENVLIHFWKKLSElCVAKISEPEADVESVLGVS 
NLE^VIiQKPKGSLKSSKKKNGKVRpADEILESNKENEKCVSSEQ 
EKIECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLADISINY 
VNERKSEQHLRFLSTLLDSFSSSRVFKMIiLGDEKQSIVQAKPLB 
IAKLVQKNPAVQFLYQKLIGWLNEDQRKDFGFLVDILYSALRCC 
DNDME R KKVIiDDLTKVDL KWNS LLX 1 1 EKACPSSDKHAL VTP WL 

KGDIU3EKLVNLADCIjCNEDLESRVSSESHFSERWTU»SLVLSQ 

HVKNDYLIGDVYVERIIVRLHETLFKTKKLSEAESSDSSVSFIC 

DVAYNYFSSAKGCLLMPSSBDLLLTLFQLCAQSKEKTHLPDFLI 

CKLKNTWUSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 

DINSLQVLLSAVDDLLNTLLESEDSYLMGVYIGSVMPNDSEWEK 

MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTLPSHIjCT 

SALLSKMVLIALRKETVLENNELEKIIAELLYSLQWCEELDNPP 

IFLIGFCEILQKMNITYDNLRVLGNMSGLLQLLFNRSREHGTLW 

SLIIAKLILSRSISSDEVKPHYKRKESFFPLTEGNLHTIQSLCP 

FLSKEEKKEFSAQCIPALLGWTKKDLCSTNGGFGHLAIFNSCLQ 

TKS IDEX3ELLHGILKI I ISWKKEHEDIFLFSCNLSEAS PEVLGV 

NIEIIRFLSLFLKYCSSPLAESEWDFIMCSMLAWLETTSENQAL 

YSIPLVQLFACVSCDLACDLSAFFDSrTLDTIGNLPVNLISEWK 

EFFSQGIHSLLLPILVTVTGENKDVSETSPQNAMLKPMCETLTY 

I S KEQLLS HKLPARLVADQKTNLPE YLQTLLNTLAPLLLFRARP 

VOIAVYHMLYKLMPELPQYDQDNLKSYGDEEEEPALSPPAALMS 

LLSIQEDLLENVLGCIPVGOIVTIKPLSEDFCYVLGYLLTWKLI 

LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRLMPENPTYAB 

TAVEVPNKDPKTFFTEELQLSIRETTMLPYHIPHLACSVYHMTL 

KDLPAMVRLWWNSSEKRVFNIVDRFTSKYVSSVLSFQEI3SVQT 

STQLFNGMTVKARATTREVMATYTIEDIVIELIIQLPSNYPU3S 

1 1 VESGKRVG VAVQQWRNWMLQLSTYLTHQNGS IMEGLALWKNN 

ywnAC iwvduu ixi-co Vi «^vWX&L»PKKACRTCKKKFHSA\CLY 

KWFTSSNKSTCSLCRETFF 




1306 


253


1 ( 


^AEIiAPASPSuiKASVSNGDTTLLCSRkQSCGMNEVRQVSLTYP 
3SPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
SAQRAPGGI^YPAASPTPHAAFLADPVSNMAMAYGSSIjAAQGKE 
liVDKN I DR FI P 1 TKLKY Y FAVDTM YVGRKLGLL FFP YLHQD W E V 

JYO^DTP VAPRFDVNAPDLYI PAMAFITYVLVAGLAtiGTQDR FS 
'DLLGLQASSAIiAWLTLEVliAIIiLSLYLVTVNTDJjTTlDLVAFL. 
3YKYVGMIGGVLMGLLFGKIGYYLVLQWCCVAI FVFMIRTLRLK 


6043 


403 


599 


- LADAAAEGVPVRGARNQLRi1YLTMAVA7U\0PMLMYWLTFHLVR 
LCLFFPFPCATPVLPbPSHSAL./CLSHLSVSSWFCPCQPPL'PC 
PLP PLQNKT AKGSLSTEQSERG 


6044 


793 


412 


KLEM^FTLISKVKISREVTMIASKFGIGQQVRHSIJ^YLGVVV 
DI DP VYSLS E PS PDELAVNDELRAAPW YHVVME DDNG L P VHTYL 
AEAQLSS ELQDEH P \ EQPSMDEliAQTIRKQLQAPRLRN 


6045


155 


2299 


SPLPQVAAWNYLRRRLSDSNFMANLPNGYMTDLQRPQPPPPPPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAVKQTTAAAAATFS EQ VGGG SGGAG RGGAAS RVLLVI DE PHT 
DWAXYFKG XK I HG B I D I KVEQAE FS DLNLVAHANGG FS VDME VL 
RNGVKWRSLKPDFVLIRQHAFSMARNGDYRSLVIGLQYAGIPS 
VNS LHS VYNFCDKP W V FAQM VRLH KKLGTE E FP LI DQT FY PNH K 

EMI^S\TTYPVWKMGHGTXWGWGFCVKVDNQHDFQDIASVVALT 
KTYATAE P F I D AKYD VT? VOK 7 firw vira ymp t ever vtw vmnri o x 

MLEQ I AMS DRY KLW VDTCSE I FGG LD I CA VEALHG KDGRDH HE 
WGSSMPLIGDHQDEDKQLIVELWNKMAOALPRQRQRDASPGR 
GSHGQTPSPGALPLGRQTSQQPAGPPAOQRPPPQGGPPQPGPGP 
QROG P PLQQRP PPOGQQHLSGLG P PAGS PLPQRLPS PTSAPQQP 
ASQAAPPTQGQGRQSR PVAGGPGAP PAARPPAS PS PQRQAGPPQ 

ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AGPVPRTGPPTTOOPRPSGPftpar:pt>K'DnT.arivT5cr\nt/riDnn*rT» 

AAGGPPHPQLNKSQSLTNAFNLPEPAPPRPSLSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRWAGPESLPPLPR 
SL IMDS PRAGTHQGPLDAETEVGADRCTSTAYQEQRPQVEQVGK 
QAPLS PGLPAMGGPGPGP CEDPAGAGGAGAGGSE PLVTVTVQCA 
FTVALRARRGADLSSLRALIiGQAIiPHQ\AQLGQLSYLAPGEDGH 
WVPI PESESLQRAWQDAAACPRGLQLQCRGAGGRPVLyQVVAQH 
SYSAQGPEDLG FRQGDTVDVLCE VDQAW LEGH CDGR IGI FP KCF 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


PVLVTSLRMREADTLRPPQU4EVSADIISTVEFNHTGELLATGD 
KGGRWIFOREPESKNAPHSOdPVnWQTPrtcuTTDPTjnvT vot o 

IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDBEGKLKDLSTVTS LQVP VLKPMDLMVEVS PRR I FANGHT YH 
INSISVNSDCETYMSADDLRINLWHLAITDRSFTP\NIVDIKPA 
NMEDLTE VI TAS E FHPHHCNLFVYS S S KGSLRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
TVKVWDL \NME AR P I ETYQVHDY LRS KIiCS LY ENDCI FDKFE CA 
WNGSDS V I MTG A\ YNN PFRM FDRNTKRD VTL\ EASRES S KP RAV 
LKPRRVCVGGKRRRDDISVDSLDFTKKILHTAWHPAENIIAIAA 
TNNLYIFQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS

KGTSNSS KTRAGANSKGRRGSQNSS EHRP PASSTSEDVKAS PSS 

ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 

EPTVLDRNCPSPVLIDCPHPNCNKKYKHINGLKYHQAHAHTDDD 

SI^EADGDSEYGEEPILHADLGSCNGXASVSQKXGSLSPARSAT 

PKVRLVE PHSPSPSS KFSTKG LCKKKLSGEGDTDLGALSNDGS D 

DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 

S ARP I / A P LAI PP QQ I YT FQTATFTAAS PG SS SG LTAT VAQAM P 

NS PQLKP IQPKPTVMGEPFTVNPALT PAKDKKKKD KKKKESS KE 

LES PLTPG KVCRAEEGKS P FRESSGNGM KM EGLLNGSS DPHQSR 

LAS I KAEADKI YS FTDNAPS PS I GGS SRLENTTPTQPLTPLHW 

TQNGAE AS S VKTNSPAYSD IS DAGEDG EG KVDS VKS KDAEQL VK 

EGAKKTLFPPQPQSKDSPYYQGFESYYSPSYAQSSPGALNPSSQ 

AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 

IQQRPNMYMQSLYYNQYAYVPPYGYSDQSYHTHLLSTNTAYRQQ 

YEEQQPCRQSLEQQQRGVDKKAEMGLKEREAALKEEWKQKPSIPP 

TLTKAPSLTDLVKSGPGKAKEPGADPAKSVIIPKLDDSSKLPGQ 

APEGLKVKLSDASHLSKEASEAKTGAECGRQAEMDPILWYRQBA 

EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 


6049 






iiDGKESTSSlA.M.PTSEESRLGSKEPRPSVHVPVSSPLTQHOSy 
I P YMHG Y S YS QS YD PNHPS YRS MP AVMMQN YPGS YIiPSS YS FS P 

YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 

SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 

S PSQRLMSTHHHHHHIiGYS LLPAQYKLPYAAGLSSTAI VASOOG 
STPSLYPPPRR 




1 215 
j 566 


| 1089 


AMTGVFDRRVPSIRSGDFQAPFQTSAAMHHPSQBSPTLPESSAT 
DS3YYSPTGGAPHGYCS PTSAS YG\KALNPYQYQYHGVNGSAGS 
YPAKAYADYSYASSYHQYGGAYNRVPSATNQPEKEVTEPEVRMV 
NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYLALPERAELAASL 
GLTQTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QS PAVWE PQG S SRS LS HHPHAHP PTSNQ S PASS YLENSAS W YTS 
AASSINSHXiPPPGSLQHPLALASGTLY 


6050
6051 




1718


KGLERTCCAMEESDSEKTTKKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKIAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI
PSAATLIKIRNARKHFEKLERVDGPKQCLLMR


6052


566


1718 


KGLERTCCAMEESDSEKTTKKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVTSHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR




566


1718 


KGDERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVPDVESRELEKDMHY
YQSCLEAILQNSPDAKIFCLVHKMDLVQEDQRDLIFKEREEDLR
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMVVMSDPSI
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6053

6054


201 


1704 


KGTEMNKSRWUSRRRHGRRSHQQNPWFRLRDSEDRSDSRAAQPA
HDSGKGDDBSPSTSSGTAGTSSVPELPGFYFDPEKKRYFRLLPG 
HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 
LRKSQLGFLNVTN YCHLAHE LRLSCMER KKVQ IRSMDP S ALAS D 
RFNL I LADTNSDRL FT VND VTVGGS K YG 1 1 NLQS L KTPTLKV FM 
HBNLYFTNRKV\NSVCWASLNHLDSHILLCLMGIiAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TG LSRR VLLTNWTGHRQS FGTNSDVLAQQ FALMAPLL FNGCRS 

G2IFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYI1MASD 
MAGKIKLWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGILVAVG 
QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 
APGLLMA VGQDLYCYS YS 




1 [ 


1054 

1 
] 


P ? I ARLQE FGTSRRHMAAPSG VH L LV RRGS HR I FS S PLNHI YLH 
KQSSSQQRRN FFFRRQRDI SHS I VLPAAVSS AHPVPKH I KKPDY 
VrTGIVPDWGDSIEVKNEDQIQGLHQACQLARHVLLLAGKSLKV 
DMTTEEIDALVHREI I SHNAYPS PLGYGGFPKSVCTSVNNVLCH 
3 1 PDSRPLQDGDI INI D VTVYYNG YHGDTSETFLVGNVDECGKK 
LVEVARRCRDEAIAACRAGAPFSVIGNTISHITHQNGFQVCPHF 
/GHGIGSYFHGHPEIWHHANDSDLPMEEGMAFTIEPIITEGSPE 
fKVL EDAWTWS LD/ TS KVS AQ FEHT VL ITS RGAQ ILTKLPHEA 


6055


421 


23^4 


PPYFLI^FIAWWLYGQSDRTETDISQSAGPPPGTLQCSALHKDP 

GCANCSRFCRIXISPPACQCHTIIVFPGNALNG^/QPPELSRTLAIiI 

S S RE PPRKKKKSQTETGKE RERTSFLTQGG KR FELQHGLAG I CM 

TLLITGDS IVSAEAVWDHVTMANRELAPKAGDVIKVLDASNKDW 

W WGQ I DDE EGN FPAS F VRLWVNH EDE VEEG PS DVQNGHLD PNSD 

CLCLGRPLQNRDOMRANVINEIM^TERUVT^ur tmrmr^vr 

R KRRDMFS DEQLKV I FGN I ED I YR FQMG FVRDLE KQYNNDDPKL 

SEXGPCFLEHQDGFWIYSEYCNNHLDACMELSKLMKDSRYQHFF 

EACRLLQQHIDIA\IDGFLLTPVQKICKYPL0LAELLKYTAQDH 

SDYRYVAAALAVMRNVTQQINERKRRLEN1DKIAQWQASVLDWE 

GED I LDRS S ELI YTG E MAW I YQP \ YG RNQQRVF FLFDHQMVbCK 

KDLIRRDILYYKGRIDMDKYEWDIEDGRDDDFNVSMKNAFKLH 

NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 

NQKRQ AftMTVR KV PKQ KGVNS ARS VP PS Y PP PQDPLNHGQ YL VP 

\DG I AQSQVFEFTEPKRSQS PFWQNFSRLTPFKK 


6056 


43


3358


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLAYMESKGAHRAGLAKVI P PKEWKPRQCYDDIDNLLI PAP IQQ 
^WTGQSGLFTQYNIQKKAMrVKEFRQLANSGKYCTPRYLDYEDL 
ERKY WKNLTFVAP I YGADINGS I YDEG VDEWNIARLNTVLDWE 
EECGISIEGVNTPYLYFGMWKTTFAWHTEDMDLYSINYLHFGEP 
KSWYAIPPEHGKRIiERLAQGFFPSSSQGCDAFLRHKMTLISPSV 
LKKYGI PFDKITQEAGE FMITFPYGYHAGFNHGFNCAESTNFAT 
VRWIDYGKVAKLCTCRKDMVKISMDIFVRKFQPDRYQLWKQGKD 
IYTIDHTKPTPASTPEVKAWI^RRRKVRKASRSFQCARSTSKRP 
KADEEEEVSDBVDGAEVPNPDSVTDDLKVSEKSEAAVKLRNTEA 
SSEEESSASRMQVEQNLSDHIKLSGNSCLSTSVTEDIKTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEKSDPSELSWPKSPESCS 
SVAESNGVLTEGEESDVESHGNGLEPGEIPAVPSGERNSFKVPS 
IAEGENKTSKSWRHPLSRPPARSPMTLVKQOAPSDEELPEVLSI 

EEEVEETESWAKPLIHLWOTKPPNFAAEQEYNATVARMKPHCAI 

CTLtLMPYHKPDSJ3Nt?PMrihPiJCT , ifT nrmn , prnvmvr>r — 

* c ****^*soi3VinEii?iu*\iiYit!ti f\LiUt, V v Ao EGKTKPLI PEMCF 

IYSEENIEYSPPNAFLEEDGTSLLISCAKCCVRVHASCYGIPSH 

EICDGWLCARCKRNAWTAECCLCNLRGGALKQTKNNKWAHVMCA 

VAVPBVRFTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVSGA 

CIQCSYGRCPASFHVTCAHAAGVL\MEPDDWPYWNITCFRHKV 

NPNVKSKACEKVISVGQTVITKHRNTRYYSCRVMAVTSQTFYEV 

MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 

AKYFGSNIAHMYOVEFEDfi^OT 2iM^nT?nT vtt tm?pt nvDtmnn 
u w v£,r oijvaoy ±>ii v ii\j<iiUX i HjUciEIjPKRVKARF 

VSAGRCHIiGTCQVNSLSSPHVSQAQQETYLGFWINS KKSQCNI F 
LSGTY 


6057 


1 


833 


FVARLKEQEGEGGLG PR KE KGRARGRERRRKMQLTR CC FVFLVQ 
GSLYLVICGQDDGPPGSEDPERDDHEGQPRPRVPRKRGHISPKS 
RPMANSTLLGLIAPPGEAWGILGQPPNRPNHSPPPSAKVKKI FG 
WGD FY SN I KT VALNLLVTGXI VDHGNG T FS VHFQHNATGQGN I S 
ISLVPPSKAVEFHQEQQIFIEAKASKIFNC\RMEWEKVE\RGRR 
TSLFTHDPAKICSRDHAQSSATWSCSQPFKWCVYIAPYSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6058

6059


i 


986 

i 


HPLPSASLGbPSVSIiGVSLCVRSALLEAVVPMLPKRRRARVGSP
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSSEATHWMEETSAEEAVSWQERRMAAAPPGCTPPALLD 
IS WLTES LGAGQPVPVECRHRLEVAGPS KGPLS PAWMPAYACQR 
PTPLTHHNTGLSEALEILAEAAGFEGSEGRLLTFCRAASVLRAL 
PS PVTTLS QLQGLPHFG EHSS R WQELIiEHGVCEEVER VRRSB / 
RLFTQIFGVGVKTADRWYREGLRTLDDLREQPQKLTQQQKAGEP 
S RE AG PWAS LNCTLDPSAS TP 




2 


3650 

1 
1 


QQDFES LADLTDHRAH RC PGDGDDDPQLS WVAS SPSS KD VAS PT 
3MIGDGCDLGLGEEEGGTGLPYPCQFCDKSFIRLSYLKRHEQIH 
5 D KL ? FKCTYCS RLFKHKRS RDRHI KLHTGDKKYHCHE CEAAFS 
^SDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKNK 
SHIAKSEKEAXkDDFMCDYCEDTFSQTEELEKHVLTRHPQLSEK 








ADbQCIHCPEVFVDENTLLAHIHQAHANQKHKCPMCPB\QFSSV 
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 
BRGSTPDSTLKPLRGQKKMRDDGQGWTKWYSCPYCSKRDFNSL 
AVLEIHLKTIHADKPQQSHTCQICLDSMPTLYNLNEHVRKLHKK 
HA YP VMQ FGNI S AFHCN YCPEMFAD I NSLQEH I R VSHCG PNANP 
SDGNNAFFCNQCSMGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV 
VQPTQSFMEVYSCPYCTNSPI FGS ILKLTKHI KENHKNIPLAHS 
KKS KAEQS P VS SD VE VS S P KRQR LS AS ANS I SNG E Y PCNQCDLK 
FSNFESFQTHLXLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTS TH YVCESCDKQ FS S VDD \ LQKH\ LLDMPH PLCCTH CT \ L 
CQE VFDS \ KVS I \Q VHLAVKHS N E KKMYRCTACNWDFR KEADLQ 
VHVKHSH LGN PAKAHKC I FCG ET FS TEVELQCH I TTHS KKYNCK 
FCSKAFHAIILLEKHIiREKHCVFDAATEKGTANGVPPMATKKAE 
PADLQGMLLKNPEAPNSHEASEDDVDASEPMYGCDICGAAYTME 
VLLQNHRLRDHN I RPGEDDGSR KKAE FI KGSHXCNVCSRTFFSE 
NGLREHLQTHRG PAKH YMCPI CGER FPSLLTLTEHKVTHS KSLD 
TGTCRI CKM PLQ SEEEFI EHCQMH PDLRNS LTG FRCVVCMQTVT 
STLELKIHGTFHMQKLAGS SAASS PNGQGLQKLYKCALCLKEFR 
SKQDLVKLDVNGLPYGLCAGCMARSANGQVGGLAPPEPADRPCA 
GliRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQCI KCQMTFENE RE I Q I HVANHMI EEGINHECKLCNQM 
FDSPAKIJjCHLIEHSFEGMGGTFKCPVCFTVFVQANKLQQHIFA 
VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 


6060 


2145 


202 


SYEIVGKNKLEVNHSQLKALCKCSLPSRLLPLGENLPLLDRGFR
KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 

DISASRPNILLLMADDLG igdigcygnntmrtpnidrlaedgvk 

LTQHISAASLCTPSRAAFLTGRYPVRSGMVSSIGYRVLQWTGAS 
GGLPTNETTFAKILEEKGYATGLIGKWHLGLNCESASDHCHHPL 
HHGFDHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALV 
ALTLVAGKLTHL I P VSWMP VI WSALSAVLLLASS YFVGALI VHA 
DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKH5PPLLFV 
SFLHVHIPLITMENFLGKSLHGLYGDNVKEMDWMVGRILDTLDV 
EGLSNSTLI YFTS DHGGSLENQLGNTQYGGWNGI YKGGKGMGGW 
EGGIRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTWRLAGSEVP 
QDRVIDGQDLLPIiLIiGTAQHSDHEFLMHYCERFLHAARWHQRDR 
GTMWKVHFVTPVFQPEGAGACYGRKVCPCFX5EKVVHHDPPLLFD 
LSRDPSETHILTPASEPVFYQVMER\VQQAVWBHQRTLSPVPLQ 
LDRLGNIWRPWLQPCCGPFPUCWCLREDDPQ 


6061


110 


1330


MNIHMKRKTIKNINTFENRMLMLDGMPAVRVKTELLESEQGSPN
VHNYPDMEAVPLLLNNVKGEPPEDSLSVDHFQTQTEPVDLSINK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRLASSPTVITS 
VS S AS S SST VLT PG PL VASAS G VGGQQFLH IIHPVPPSS PMNLQ 
SNKLS HVHR I PVWQS VP WYTAVRS PGNVNNT I WPLLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNETGSTAJjSIA 
RAVQE VHP S P VSRVRGNRMNNQKF PCS I S P FSI ES TRRQRTVLN 
PPDSRKTAYSTDCDF\EGLQQKL YTKSS S PGRVHRRTHTGE KPY 
KCTWEGCTW KFARS DELTRH YRKHTGVKP F KCADCD R S F S R SDH 
LALHRRRHMLV 




71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDPE
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV


6063 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE








ESGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCKPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV


6064 


913 


311 


KLPQSLPRPTSHSPPYSLKKMTDUVAVWDVALSDGVHKIEFEHG 
TTSGKRWYVDGKEEZRKEWMPKLVGKBTFYVGAAKTKATINID 
AISGFAYEYTLEINGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LEOAMDVWCNGKKLETAGEFVDDGTETHFSIGTH\ACYIKAV\ 
SSG\KRKEGIIHTLIVDNREIPEIAS 


6065 
6066 


1153 


641 


MS VR VARVAW VRGLGAS YRRGAS S FPVP PPGAQGVAELLRDATG 
AEEEAPWAATERRMPGQCSVLIiFPGQGSQWGKGRGLLNYPRVR 
ELYAAARRVLGYDLLELSLHGPQETLDRTVHCQPAIFVASLAAV 
EKLHHLQPSVIENCVAAAGFSVGEFAALVFAGAMEFAEG 




68 


3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVRIW 
EDliDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 
VPDG I LTRFTTNANHWFNGDGTKIA^GSSD\ FLVKI VDVMDSS 
QQKT FRGHDAP Vl*S LS FDPKDI FLAS AS CDGS VRVWQ I S DQTCA 
ISWPIjLQKCNDVINaKSICRLAWQPKSGKLLAIPVEKSVKLYRR 
ES WSHQFDLSDNFISQTLNI VTWS PCGQYLAAGS INGI» 1 1 VWNV 

etkdcmervkhekgyaicglawhptcgrisytdaegnlgllewv 
cdpsgktssskvssrvekdyndlfdgddmsnagdflndnaveip 

SFSKGIINDDEDDEDLMMASGRPRQRSHILEDDENSVDISMIiKT 

gssllkeeeedgqegsihnlplvtsqrpfydgpmptprqkpfqs 
gstplhlthrfmvwns igiircyndeqdnaidvefhdtsihhat 
hlsntlnytiadlsheaillacestdelasklhclhfsswdssk 

EWI IDLPQNE DI RAI CLGQG WAAAATS ALLLR LFTIGG VQ KE VF 

slagpwsmaghgeqlfivyhrgtgfdgdqclgvqllelgkkfck 
qilhgdplpltrksylawigfsaegtpcyvdsegivrmlnrglg 
ntwtpicntrehckgksdhywwgihenpqqlrcipckgsrfpp 

TLPRPAVAILSFKljPYCOTATPKnnMTriroir , wocuTtrtjf k TTJT nvr * 

KNGYEYEESTKNQATKEQQELLMKMI.ALSCKLEREFRCVELADL 
MTQNAVNLAIKYASRSRKLILAQKLSELAVEKAAELTATQVEEE 

eeeedfrkklnagysntatewsqprfrwqveedaedsgeaddee 

KPEIHKPGQNSFSKSTNSSDVSAKSGAVTFSSQGRVNPFKVSAS 
S KE PAMSMNSARSTNILDNMGKSSKKS^ALSRTTNNFJf c p t t vu 

LIPKPKPKQASAASYFQKRNSQTNKTBEVKEENLKNVLSETPAI 
C PPCNTENQRPKTGFQMWLEENRSNI LS DNPDFSDEAD 1 1 KEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRWDESDETEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 


6067 


858 


321 


IiPWQRLGVLLSRGKMAVTGWLESLRTAQKTALLQDGRRKVliYLF 

PDGKEMAEEYDEKTSELLVRKWRVKSALGAMGQWQLEVGDPAPL 

GAGNLGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPKDVYSV 

SVDQKERCIIVRTTNKKYYKKFSIPDLDRHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKNLSDSGEEP 
RGEAEAPHHGTGHPESAGEHALEPPAPAGASASTPPPPAPEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 

pavlqapqpkalsqtvpssgtng vs lpadctgavpaas pdtaaw 
rspseaadevcaleekepqknessnaseeeacekkdpatqqafv 
fgqnlrdr vkl i n es vdeadmenaghps adtptatnyflq y i s s 
slen3tnsadassnkfvfgqnmservlsppklnevssdanrena 
aaesgsesssqeatpekeslaesaaaytkatarkcllekvevit 
geeaesnvlqmqcklfvfdktsqswvergrgllrlndmastddg 
tlqsrlsdagprgslr\lilntklwaqmqidkasek\siritam 
dnedqgvkvflisasskdtgqvyaalhhrilalrsrveqeqeak 

mpapepgaapsneeddsddddvlapsgataagagdegdgqttgs 
r 


6069 


583 


27 


PTRPGQ AGS S 3AMAAQR LGKRVLS KLQS PS RARGPGGS PGGLQK ~~ 

rharvtvkydrrelqrrldvekwidgrleelyrgmeadmpdein 
idelleleseeersrkiqgllkscgkpvedfiSqellaklqglhr 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








QAPGLRQPS PS?\DGQPSAPFQGPGARTASPLTLLALFPGP PER 
RPALLCVLSCI 


6070 


478 


858 


I RVTVDGEPLHY I FPLQFbDS PEW/R5TETHRGRHF\QVTLTAE 
TDCRYVSWRRKKLYLL FAQHRY ISRbFS VLIGSD I ADKLY AUTO 
RVYIGKRYKYDIRLPNFYQMSTPEIRRSPLTQHFQNSRRYW 


6071 


2 


1654 


HEARTKGNMALAR P \ VRLFSLVTRLLLAPRRGLTVRSPDE PLP V 
VR1 P VALQRQLE QRQSRR RN LP RP VLVRPG PLLVS ARR P ELNQ P 
ARLTIX5RWERAPLASQGWKSRRARRDHFSI ERAQQEAPAVRKLS 
SKGSFADLGAWKPRVLHALQE\AAPEWQ\ PTTVQSSTI PSLLR 
GRHWCAAETGSGICrLSYLLPLLQRLIiG\HPSLDSLPIPAPRGL 
VLV PSR3 IiAQQ VRAVAQ P LGRS LG LLVR DLEGGHGMRR I RLQ LS 
RQPS AD VLVATPGALWKALXS RLI SLEQLS FLVLDEADTLLD2S 
FLSLVDYILEKSHIAEGPADLEDPFNPKAQLVLVGATFPBGVGQ 
LLN KVAS PDAVTTI TSS KLHCIMPHVKQTFLRLKGADKVAELVH 
I LKHRDRAERTG PS GTVLV FCNS S S TVNWLG Y I LDDHK I QHLRL 
QGQW P ALMR VG I FQS FQ KS S RDI LLCTD I ASRGLDS TGVE LWN 
• YDFP PTLQDYIHRAGR VGRVGSEVPGTVIS FVTH P WDVSLVQKI 
ELAARRRRSLPGLASSVKEPLPQAT 


6072 


1 


742 


KME RTEMMPT INSQLEFKSKP FPL VS S SRWLVKRGE LTA Y VEDT 
VLFSRRTSKQQVYFFLFNDVLIITKXKSEESYNVNDYSLRDQLL 
VESCDNEELNSSPGKNSSTMLYSRQSSASHLPTLTVLSNHAjNEK 
VEMLLGAETQSERARW I TALGHSSGKPPADRTSLTQVE I VRS FT 
AKQPDELSLQVADWIiI\YQRVSDGWYEGER\LRDGERGWFPME 
CAKE ITCQAT IDKNVERMGRLLGLETNV 


6073 


620 


860 


PCRRGLARPLSRRPG/ S I LVHCAVGVSRSATLVLAYLMLYHHLT 
LVEAIKKVKDHRGI I PNKGKLRQLtiALDRRLRQGLEA 


""6074 


168 


1110 


PGARCMATELQCPDSMPCHNQQVNSASTPSPEQLRPGDLILDHA 
GGNRASRAKVILLTG YAHS SLPAELDSGACGGSS LNS EGNSGSG 
DSSSYDAPAGNSFLEDCELSRQIGAQLKLLPMNDQI RELQTI IR 
DKTASRGDFMFSADRLIRLWEEGLNQLPYKECMVTTPTGYKYE 
GVKFEKGNCGVSIMRSGEAMEQGLRDCCRSIRIGKILIQSDEET 
QRAKVYYAKFPPDIYRRKVLLMYPILQTG\NTVIEAVKVLIEHG 
VQPSVI I LLSLFSTPHGAKS I1QEFPEI TI LITE VH PVAPTHFG 
QKYFGTD 


6075 


320 


1091 


P?TCQPQEVEHH\YGYVPILGNKTLPSRCHQCVIVSSSSHLLGT 
KLGPE IERAECTI RMNDAPTTG YSADVGNKTTYRVVAHSS VFR V 
LRRPQEFVNRTPETVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFITEKRVPSSWAQLYGITFSHPSWT 


6076 


1721 


107 


hpspteaprvqhltmdctwrIlflvaaatgthaqv'Olvqsgabv 
kjcpgas vkvs ckvsg ytlte ls mhwvrq apg kgle wmg af dpe d 

GETI YAQKFQGRVTMTEDTSTDTAYMELSSLRS edtavy ycatd 
HGDYAFDIWGQGTMVTVSSAPTKAPDVFPIISGCRHPKDNSPW 
IACLITGYHPTSV\TVTWYWGTQSQA\QRTFPEIQRRDSYYMTS 
SQLSTPLQQWRQGEYKCWQHTASKSKKEIFRWPESPKAQASSV 
P7AQPQAEGSLAKATTAPATIRNTGRGGEEKKKEKEKEEQEERE 
TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCFWGSDLKD 
AHLTWEVAGKVPTGG VE EGLLERHS NGS Q SQHS RLTL PRS L WNA 
GTS VTCTLNHPSLPPQRLMALREPAAQAPVKLS LNLLAS SDPPE 
A\ASWLLCEVSGFSPPN I LLMWLEDHGEVNTSG FAPARPLPKP \ 
RSTTFWA\WSVLRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 
BVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQPIFWIGLISSVCCVFAQTDENRCLKANAKSCGECIQ 
AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GS KD I KKNKNVTNRS KGTAEKLKPEDITQ IQ PQQLVLRLRSGBP 
QTFTLKFKRAEDYPIDLYYLM\DLSYSMKDDLENVKSLGTDLMN 
EMRRITSDFRIGFGSFVEKTVMPYISTTPAKLRNPCTSEQNCTS 
PFSYKNVLSLTNKGEVFNELVGKQRISGNLDSPEGGFDAIMQVA 
VCGS LI GWRUVTRLLVFS TDAG FHFAGDGKLGG I VLPNDGQCHL 


6078 






ennmytmshy^ypsiahlvqklsenniqtifavteefxjpWxeH 

LKNLI P KS AVGTLS ANSSNVI QL I IDA YNS LS SEVI LENGK LS E 
GVTISYQSY\CKNGVNGTGENGRKCSNISIGDEVQFEISITSNK 
CPKXDSDSFKIRPLGFTEBVEVILQYICECECQSEGIPESPKCH 
EGNGTFBCGACRCNEGRVGRHGECSTDEVNSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNE I YSGKFCECDNFNCDRS 
NGLI C3GNGVCKCRVCECNPNYTGSACDCSLDTSTCEASNGQ IC 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKIiPQPVQPDPVSHCKEKD 
VDD CWFYFT YS VNGNN E VMVHWENPEC PTGPDI I P I VAGW AG 

iVLIGIiALLLIWKLLMIIHDRREFAKFEKEKMNAKWDTGENPIY 
KSAVTTWNPKYEGK 


6079 


1426 


1B0 


BTEDVMELLEEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGILEj 
GSVRNSLWRPVPFKCPTCRKKTFSYWELIPLQVNYSLKGIVEKY 
NKIXISPKMPVCKGH\LGQPLNIF\CL\TDMQLDL/CGIC\ATR 
GEHTKrlVFCSIEDAYAQERDAFESLFQSFETWRRGDALSRLDTIi 
BTSKRKSLQLLTKDSDKVKEFFEKLQHTLDQKKNEILSDFETMK 
LAVMQA YD P E INKLNT I LQEQRMA FNI AE AF KD VS E P I V FLQQM 
QEFREFCIKVIKETPLPPSNLPASPLMKNFDTSQWED1KLVDVDK 
LS L PQDTGT FI S KI P WS FYfCL FLL I LLLGLVl VFG PTM FLEWS L 
FDDLATWKGCLSNFS S YLTKTADFI EQSVFYWEQVTDGFFI FNE 
RFKNFTLWLNNVAEFVCKYKLL 




lSi86 


141 


atardlgcarridrvvmestpsrglnrvhlqcrnlqeflgglspH 

GVLDRLYGHPATCLAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 
LVfVTCKEFSKAQEESTGLLSGJjRIWHTQLLPGGLOGLILNPIFRQ 
NLRIALLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGSPSAAVSQDLAQLLSQAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQLSFSTLGKD 
YSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTBSELQIALIALFSE 
MLY PF P \KMW \ AR VTR \ ES VQQAI AS G ITAQQ I IHFLRTRAHP 
VKLKQT? VLPPTI TDQ I RLWE LERDRLRFTEG VL YNQFLS Q VDF 
ELL \ LAHAPKLGVLVFE /NTPAKRLM WTPAGHSDVKR FWKRQK 
HSS 


6080 


1 


1199 


IETIDHVGEFAJ^QAAGVSRQRAAT0GIX5SNQNALKYLGWjH 
TLRQQCLDSGVLFKDPEFPACPSALGYXDLGPGSPQTQGIIWKR 

ptelcpspqfivggatrtdicqgglgdcwllaaiasltlneell 

YRWPRDQDFQENYAGIFHFQPLCPPS?\FWQYGEWVEWIDDR 
LPTKNGQLLFLHSECX5NEFWSALLEKAYAKLNGCYEALAGGSTV 
EGFEDFTGGISEFYDLKKPPANLYQIIRKALiCAGSLLGCSIDVy 
S AAE AE AI TSQ KL VKSHAYS V1X3 VE E VNFQGHPEKL I RLRNPWG 
EVEKSGAWSDDAPEWNHIDPRRKEELDKKVEDGEFWMSLSDFVR 

QFSRLEICNLSPDSLSSEEVHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS 


6081 
6082 


3 


865 


EMLPLLLPLPLLWA/GALAQDARFRLEMPESVTVQEGLCIFVHC[ 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGRFHLLGDPSRNNCSLSIRDARRRDWGSYFFWVARGRTKFSY 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
P P I FS WM S AAPTS LG PRTLH S S VLT 1 1 P RPQDHGTNL I CQ VT FP 

GAGVTTERTIQLSVSMKSGTVEEVWLAVGWAVKILLLCLCLI 
ILSFHKKKAVRAVEVEENVYAVMG 


6083 


283 
1865 


1288 
309 


EARSPOPTOTRTAPGIiAAPfiriAODanr pi r r cDDBon7v?r^r.ys — 1 
wj-uwf w« * v*" A«vr«ijrt«ruijMyt'HMjuKijijuoKrPSAAWDGDGD I 

PESVGQPEEASPEEQPEEASAEEERPBDQQEEEAAAAA\Y\LDE 

L PEPLLA/ LRVLAALPRHE \LVQACR \LVCLRWKELVDGAPLWL 

LKCQQEGLVPEGGVEEERDHWQQFYFLSKRRRNLLRNPCGEEDL 

EGWCDVEHGGDGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 

KAQVIDLQAEGYWE ELLDTTQPAI WKDWYSGRfl DAGCLYBLTV 

KLLS EH ENVLAE FS SGQ VAVPQDSDGGGWME I SHT FTD YGPG VR 

FVRFEHGGQDSVYWKGWFGARVTNSSVWVEP 

KQWCAERRGLGMSLADELliADLEEAAEBEEGGSYGEEEEEPAIE^ | 



"608T 



6085 






1865 



"6087- 



TtT" 






309 



1456 




"vgKKTQLDLSGDSVKTIAKL WDSKMFABIMMKIEEYISKQAKA " 
SEVMGPVEAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKR 
FPSLESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM 
WS VTAS TTQGQQLS EE EL ERLEEACDMALELNAS KHR I YE YVE 
SRMSFIAPNtiSI I IGASTAAKIMGVAGGLTNIjSKMPACWIMLLG 

aqrktlsgfsstsvlphtgyiyhsdivqslppipppfsvapXdl 

RRJCAARLVAAKCTLAARVDSFHESTEGICVGYELKDEIERKFDKW 

qepppvkqvkplpapldgqrkkrggrryrkmkerlglteir\kq 
anrmsfgeieedayqedlgfslghlgksgsgrvrotovneatka 
risktlqrtlqkqswyggkstirdrssgtassvaftplqglei 
i vnpqaaekkvaeanqkyfssmaeflkvkgek sglmst 

KQWCAERRUiiGMSI^ELLADLEEAAEEEKGGSYGEEEEEPAIE 
D VQEETQLDLSGDS VKTIAKLWDS KMFAE I MMKI EEYI SKQAKA 
! SEVMGPVBAAPEYRVIVDANNLTVEIENELNIIHKFIRDKYSKK 
. FPELESLVPNALDYIRTVKELGNSLDKCKNNENLiQQILTNATlM 
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE 
SRMS FI APNLS 1 1 IGASTAAKI MGVAGGLTNLSKMPACNI MLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\Db 
RRKAARL VAA KCTLAARVDS FHES TEGKVG YELKDE I E R K FDKW 

QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
| ANRMS FGE I E EDAYQ EDLG FS LGH LG KSGS GR VRQTQVNE ATKA 

R I S KTLQRT LQKQS VV YGGKST I RDR SSGTAS S VAFTPLQGL E I 
■ VNPQAAEXKVAEANQKYFSSMAEFLKVXGBK SGLMST 

sgprsfqgnravgrjsu;gkrnpevtllpgvsservrrwrrarv~ 
I gvarvk pgnpwkpspatqvpr/vpaqvylpgrgpplregeelvm 

DBEAYVIiYKRAQTGAPCLSFDIVRDHLGDNRTELPLTLYIiCAGT 

qaesaqsnrlmmlrmhnlhgtkpppsegsdeeeeeedebdbeer 
I kpqlelamvphygginrvrvswlgeepvagvwsekgqvevfalr 

rllqweepqalaaflrceqaqmkp I fsfaghmgegfaldws PR 
I vtgrlltgdcqknihlwtptdggswhvdqrpfvghtrsvedlqw 

sptentvfascsadasiriwdiraapskacmlttatahdgdvnv 
I iswsrrepfllsggddgalkiwdlrqfksgspvatfkqhvapvt 

svewhpqdsgvfaasgadhqitqwdlg/iverdpeagdveadog 

ladlpqqllfvhqgetedkelhwhpqcpgllvstalsgftifrt 

ISV 



1357 



GAATQHGGAMNLLPCNPHGNG LLYAG FNQUHGCFACGMEA'GFR' V - 

yotdplkekekqefleggvghvemlfrcnylalvgggkkpkypp 

NKVMIWDDLKKKTVIEIEFSTEVKAVKLRR\DKIVWLDSMIKV 
I FTFTHNP \HQLHVFE\TCYNPKGLC VLCPNSNNSLLAFPGTHTG 

hvqlvdlastekppvdipahegvlscialnlqgtriatasbkgt 

LIRIFDTSSGHLIQELRRGSQAANIYCINFNQDASblCVSSDHG 

TVHIFAAEDPKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 

1 CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKL 



QNS QRTGL P J.TI FS RS F PLLTGS DLCENMPCTCTWRN W RQW I R P 
LVAVI YLVS I WAVPLCVWELQKLEVGIHTKAWFI AGI FLLLTI 
PISLWVILQHLVHYTQPELQKPIIRILWMVPIYSLDSWIALKYP 
GIAIYVDTCRECYEAYVIYNFMGFLTNYLTNRYPNLVLILEAKD 
QQKHFPPLCCCPPWAMGEVLLFRCECLGVLQYTVVRPFTTIVALI 
CELLGIYDEGNFSFSNAWTYLVIINNMSQLFAMYCLLLFYKVLK 
EELSPIQPVGKFLCVKLWFVSFWQAWIALLVKVGVISEKHTW 
EWQTVEAVATGLQDFIICIEMFLAAIA\HHYTFSYKPYVQEAEE 
GSCFDSFLAMWDVSDIRDDISEQVRHVGRTVRGHPRKKLFPEDQ 
DQNEHTS LLS SSSQDA I S I AS S M P PS PMGHYQG FGHT VTPQTT P 
TTAKISDEILSDTIGEKKEPSDKSVDS 



G ASGLVRLLQQGHRCLLAP VAP KLVPPVRGVK KG FRAA FRFQKE 
LERQRLIiRCPPPPVRRSEKPNWDYHAEIQAFGHRLQENFSLDLL 
KTAFVNS CYIKS EEAKRQQLG IEKEAVLLNLKSNQELSEQGTS F 

sqtcltofledeypdmptbgiknlvdfltgeewchvaAklavb 

QLTLSEBFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFLI 
TQMTGKELFEMWKIINPMGLLVEELKKRNVSAPESRLTRQSG\A 








PTALPL YF VG LYCDKKLI AEG PGET VL VAE B BAAR VALRKLYGF" 
TENRRPWNYS KPKETLRAEKS ITAS 


6089 


3 


3054 


TRLGIPGSTISSRPRLCALAAEGHFICHSWTCSRAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQSliVKHSSGIKGSLPLQK 
LHLVSRSIYHSHHPTLKLQRPQLRTSFQQFSSLTNLPLRFCLKFS 
PIKYGYQPRRNFWPARIiATRLLKLRYLILGSAVGGGYTAXKTFD 
QWKDMI PDLSEYKWI VPDIVWEI DE YIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VS DKEK I DQLQ EELLHTQ LK YQR I LERLE KENKELR KL VLQKDD 
KG I ? FI E S LR KS LI DMYS E VLDVLS DYDAS YNTQDHL P R VVWG 
DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
LFKDSSREFDLTKEEDLAALRHEIELRKRKNVKEGCTVSPETIS 
LNVKGPGLQRMVLVDLPGVINTVTSGMAPDTKETI FS I SKAYMQ 
DPNAI I LCIQDGSVDAERS I VTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVASPSRIQQIIEGKLFPMKAIX3YFAWTGKGNSSESIEAI 
REYEEEFFQNS KLLKTS MLKAKQ VTTRNLS LA VS DC FW KMV RE S 

VEQQADSFKATRFNLETEWKNKYPRLRELDRNELFEKAKNEILD 
E V I S LSQVT P KHWEE I LQQSliWERVSTHV I EN I Y LPAAQTMNSG 
T FNTT VD I KLKQWTDKQIjPNKAVE VAWETLQ BE FS R FMTE P KG K 
EHDDI FDKLKEAVKEES I KRHKWNDFAEDSLRVIQHNALEDRSI 
SDKQQWDAAIYFMEEAI^ARLKDTENA1EKMVGPD\WKKRWLYW 
K^TQEO^mNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRGVEVDPSLIKDTWHQVYRRHFLKTALNHCNLCRRGFYYYQRH 
FVDSELECNDWLFWRIQRMLAITANTLRQQLTNTEVRRLEKNV 

KEVLEDFAEDGEKKIKLLTGKRVQIiAEDLKKVREIQEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVIiEQAS/ASPPLATQTWPLQHCidiPELPVQASIL 
FELQLFFCQLIALFVHYINIYKTVWWYPPSHPPSHTSLNFHXID 
FNLLMVTTIVU3RRFIGSIVKEASQRGKVSLFRSILLFLTRFTV 
LTATGWSLCRSLlHLFRTYSFLNLL/FPLLSVWDVHSVPAAELiR 
P\RKTSLFNHMASMGPREAVSGLAKSRDYLLTLR\RRGSSTQDS 
CMARTPCP/PHACCLSPSLIRSEVEFLKMDFNWRMKEVLVSSML 
SAY YVAFVPVW FVKNTH Y YD KRWS CE LFLL VS I S TS VILMQHLL 
PASYCDLLHKAAAHLGCWQKVT)PALCSNVLQHPWTEECZPWPQGV 
LVKKS KNVYKAVGH YNVAI PSD VSH FRFHFFFSKPLRILNI LLL 

LEGAVIVYQLYSLMSSEKWHQTISIALILFSNYYAFFKLLRDRL 
VLGKAYS YSAS PQRDLDHR FS 


6091 


3279 


412 


SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAPAPGTPAASG" 

WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 

PSDP PADHAVR PLHGARGGQ PP VPQQHVLERQVOIiSQGQNWl K 

VKP PS KS GS ASASGAQRGS LEE FEDTPWSDQRPREGEGEP PRGQ 

IiQPSRPTRARGTCSVEDPLLVCQKEPGKPRMVKSVGSVGDSPRE 

PRRTVSESVIAVKASFPSSALPPRTGVALGRKLGSHSVASCAPQ 

LLGDRRVDAGHTDQPVPSGSVGGPARPASGPRQAREASLWTCR 

TNKFRKNNYKWVAAS S KS PR VAR RALS PR VAAENVCKAS AGMAN 

KVEKPQLIADPEPKPRKPATSSKPG9APSKYKWKASSPSASSSS 

S FRWQS EAGSKDHASQLS PVLSRS PSGD \ RPAVGHSGLKPLSGE 

TPLSAYKVKSRTKIIRRRGSTSLPGDKKSGTSPAATAKSHLSLR 

RRQALRGKSSPVLKKTPNKGLVQVTTHRLCRLPPSRAHLPTKEA 

SSLHAVRTAPTSKVI KTRYR I VK KTPASPLSAPPFPLSLPSWRA 

RRLSLSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYRCIGGVLY 

KVSANKLS KTSGQPSDAGSRPLLRTGRLD PAGSCSRS LASRAVQ 

RSLAIIRQARQRREKRKEYCMYYNRFGRCNRGERCPYIHDPEICV 

AVCTRFVRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGICSNSN 

CPYSHVYVSRKAEVCSDFLKGYCPLGAKCKKKHTLLCPDFARRG 

AC PRGAQCQLLHRTQKRHS RRAATS PAPG PS DATARS R VSASHG 

PRKPSASQRPTRQTPS SAALTAAA VAAPPHCPGGSAS PS SS KAS 

SSSSSSSSPPASU3HEAPSL0EAALAAACSNRLCKLPSFISLQS 

S PS PGAQPRVRAPRAPLTKDSGKPLH I KPRL 


6092 


143 


3190 

1 


^KAPPTGESSEPEAKVIiHTKRLYRAVVEAVIfRLDIjIIiCNKTAYQ 
SVFKPEIJISLRNKIiRELCVKLMFLHPVDYGRKAEELLWRKVYYE 








V1QL I KTNKKHI HSRSTLECAYRTHLVAG IG F YQHLLL Y IQSH Y 
QLELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY 
LGDLSRYQNELAGVDTELLAERFYYQALSVAPQIGMPFNQLGTL 
AGSKYYNVEAMYCYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL 
lilFQMVIICLMCVHSLBRAGSKQYSAAIAFTLALFSHLVNHVNI 
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFESDSSHD 
SARAS EG SDSGS DKS LEGGGT AF DAE TD S EMNSQES R S DLEDME 
EEEGTRSPTLEPPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQ 
MFQTKRCFRliAPTFSNLLLQPTTNPHTSASHRPCVNGDVDKPSE 
PAS E EG S ES EGS E S S GRS CRNE RSI QEKLQVLMAEGLL PAVKVF 
LDWLRTNPDLIIVCAQSSQSLWNRLSVLLNLLPAAGELQESGLA 
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLSTLEESWRI CCIRS FGHF IARLQGS ILQFWPE VGIF 
VS IAQS EQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV 
SQLEGS LQQP KAQS AMS P YL VP DTQALCHHL P V I RQLATS GRF I 
VI I PRTVIDGLDLLKKBHPGARDGIRYLEAEFKKGNRYIRCQKE 
VGKSFERHKLKRQDADAWTLYKILDSCKQLT\LAQGAGEEDPSG 
MVTI ITGLPLDNPSLLSGPMQAALQAAAHASVDIKNVLDFYKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLALR VAR T / S RWGAL \ RGAVWAPGTR PS KRRA C WALL 
PPVPCCLGCLAERWRLRPAALGLRLPGIGQRNHCSGAGKAAPR\ 
PAAGAGAAAEAPGGQWGPASTPSLYENPWTIPNMLSMTRIGLAP 
VLGYL 1 1 hfaU rNI ALG VFALAGLTDLLDG F IAKNWANQKSAIaSS 
ALDPLADKILISILYVSLTYADLIPVPLTYMIISRDVMLIAAVF 
Y VR YRTL PTPRTL A K Y FN PC YATAR L K PTFI S KVNTAVQL I L VA 
ASLAAP VFNYADS I YLQILWCFTAFTTAASAYS YYHYGRKTVQV 
IKD 


6094 


23 


1010 


PFLRCLRGDOKAKMSER KVLNKY Y P PDF!) PS KI PKLKLPKDRO Y 
VVRLMAPFNWRCKTCGEYIYKGKECFNARKETVQNEVYLGLPIFR 
FYI KCTRCLAE I TFKTDPENTDYTMEHGATRNFQAEKLLEEEEK 
R VQ KE R EDEELNN PMKVLENRTKDS KLEME VLENLQEL KDLNQR 
QAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLVWKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQP YTPDAWRVLPEPTGCI PGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFAL I NKLDIQCDLKTLSDD I KESLESEGKNS KKE 
EPQELLQSQDFVGEKLGSGEPSHS 





6096 






VKVHTVPKPGKGADLSKPPCHKAKEIRKERKRLKLMQQ^PAGEL 
BGFOAQGH P PS L F P PKAKSNQ PKS LEDL I FES LPENASHKLE VR 
WRSSPPSSQFKATLLESyQVYKRYQMVIHKNPPDTPTESQFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAFTRQLHEKTSQLSYYY 
MG F Y I HSC P KM K YKGQYRPSDLLCP ETYVWVP I EQCL PS LENS K 
j YCRFNQDPEAVDEDRSTEPDRLQVFHKRAIMPYGVYKXQQKDPS 
BBAAVLQYASLVGQKCSERMLLFRN 


6097 


2277 


575 


QR VRAALLs S AM E DS EALG FEHMGLDPRLLQ AVTOLG WS RPTL I 
RAi e LiAliEGKDLLARARTGSGKTAAYAI PMLQLLLHRKATG P 
VVEQAVRGL VLVPT KELARQAQSM I QQLATYCARDVRVANVS AA 
EDSVSQRAVLMEKPDWVGTPSRILSHLQQDSLKLRDSLELLW 
DEADLLFSFGFEEELKSLLCHLPRIYQAFLMSATFNEDVQALKE 
LIIiHNPVTLKLQESQLPGPDQLQQFQWCETEEDKFLLLYAIjLK 
LSL1RGKSLLFVNTLERSYRLRLFLEQFSIPTCVLNGELPLRSR 
CHI ISQFNQGFYDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVLNFDLPPTPEAYIHRAGRTARANNPGIV 
LT FVLPTEQ FHLGKI EELLS GENRG P I LLP YQ FRMEE I EG FR YR 
CRDAMRS VTKQAIREARLKEI KEBLLHSEKLKTYFEDNPR \DLQ 
LLRHDLPLHPAVVKPHLGHVPDYLVP PALRGLVRPHKK\GRS CL 
PLVGRPREQSPRTHCAASSTKERNSDPQPSPPEWGPLWS 




1673 


192 


APGTMSGGKKKSSFQITSVTTDYEGPGSPGASDPPTpQPP^pp 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLPHGLGEP 
YRRGRWTCVDVYERDLEPHSFGGLLEGIRGASGGAGGRSLDSRL 
ELASLGLGAPTPPSGLSQGPTSWLRPPPTSPGPQARSFTGGLGQ 
LWPSKAKAEKPPLSASSPQQRPPEPETGBSAGTSRAATPLPSL 
RVEAEAGGSGARTPPLSRRKAVDMRLRl^ELGAPEEMGQVPPLDS 
RPS S PAL YFTHDAS L VHKS PD PFGAVAAQ KFS LAH SMLA I SGHL 

DSDDDSCJS^QTiVn t nMv t irriT* unr nr/^nr 

jouL/uaoo^L vtr l UNA I kQAMDLVKSHLMFAVREEVEVLKEQI 

RELAERNAALEQEMGLLRALA\SPEQLGSAGPPRGVPR\LGPPA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 
VQALS NGP WS PGPL PHLL 1 1 PS LDGG G EG FRTGRQQGAP FGE ET 
QPPPSLPGTPQQ 


6098 
6099 


168 


1074 


MfCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK 

RLNSASORS*?^TiPPQMPTf CQTDTlTDtTTMT Tmrrpirtl l/nr,™«^T,- 

jno y iw ao utr f o n rciva o l f l KRE l MLT P VTVAYSPKRSPKE 
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL 
\ QFLPLDDSE EK \ T YS EKAT \DNI VNHS SC P E P VPNG VKKVS VR 

TAWEKNKSVSYEQCKPVSVTPQGNDFBYTAKIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 


6100 


168 


1074 


KYULRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGK I FKNWGTOTE KEDTSN I NPROTCTCVMa. c d c d p vn * Dn «■ 

RLNSASQRSSSLPPSNRKSSTPTKREIMLTPV'IVAYSPKRSPKE 
NLSPGFSHLLSKNESSPIRPDILLDDLDTVPVSTLQRTNPRKQL 
\QFLPLDDSEEK\TYS EKAT\DNI VNHS SCPEPVPNG VKKVS VR 
TAW2KNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD 




2 


713 


FVEVSGYRSRADPEPRGRDTMTYAYLFKYIIIGDTGVGkSCLLL 
QFTDWIFQPVHDLTIGVEFGARMVNIDGKQIKLQIWDTAGQESF 
RS ITRS Y YRGAAGALL VYDI TR R ETFNHLTS W LEDARQHS S SNM 

VIMLIGNKSDLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 
VEEAFINTAKEIYRKIQQGLFDVHNEANGIKIGPQQS1STSVGP 
SASQRNSRDIGSNSGCC 


6101 


1 


1399 



FRGRAWPLREVSHWLGCRRVCSWSASWGRLPALSARLSPLLAFR 
3 KM VFPLS CAVQQYAWG KMGSNSE VARLLAS S D PLAQ IAED KP Y 
\BLWMGTHPRGDAKILDNRISQKTLSQWIAENQDSLGSKVKDTF 








NGNLPFLFKVLSVETPLSlQAHPNKEIAEKLHI^APQfaYPDAMH 
KP EMA I ALTP FQGIjCG FRP VEE I VTFLKKVP EFQFL IGDEAATH 
LKQTMSHDSQAVASSI^SCFSHI^llCSEKiCVVVBQLNLLVKRISQ 
QAAAGNNMEDIPGELLLQLHQQYPGDIGCFAIYFLNLLTLKPGE 
AMFLEANVPHAYLKGDCVECMACS DNTVRAGLTPKFI DVPTLCE 
MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKA\EVP 
G\S VTEYKDLAIiDSAS ILLMVQGTVIASTPTTQTP1 PLQRGGVL 
FIGANESVSLKLTEPKDLLI FRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAG5IGASPAAPCCSESGDERKN 
LEE KS DI NVTVLIGS KQ VSEGT DNGDLPS YVSAF I E KEVGNDLK 
SLKKIoDKLIEQRTVSKMQLEEQVLTISSEIPKRlRSALKNAEES 
KQFLNQFLEQBTHLFSAINSHLLTAQPKMDDLGTMISQIEEIER 
HLAYLKW I SQ I EELSDNIQQYLMTNNVP EAASTLVSMAELD I Kb 
QESSCTHLLGFMRATVKFWHKILKDIOjTSDFEEILAQLHWPFIA 
PPQSQTVGLSRPASAPEIYSYLETLFCQLLKLQTSHELLTEPK\ 

hsqkntlflppllss/wpiqvnltplqkrfryhfrgnrqtnvls 

KPEWYLAQVLMWIGNHTEFLDEKIQPILDKVGSLVNARLEFSRG 
LMMLVLEKLATDIPCLLYDDNLFCHLVDEVLLFERELHSVHGYP 
GTFAS CMH I LS E ETCFQRW LT VER KFALQ KMDS MLS S E AAWVS Q 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFIiE 
LQKDLVDDFR IRLTQVMKEETRASLGFR YCAILNAVNYI STVLA 
DWADl^FI^LOQAALEVFAENNTLSKLQLGQLASMESSVFDDM 
INLLERLKHDMLTRQVDHVFREVKDAAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLEKIFWQMLVEKLD 
VYIYQEIILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF 
KH I KEACI VLNLNVGSALTAGKDVLPVQLQGS FPAT 


6103 


207 


2523 


ESNSTMTTYLEFIQQNEERDGVRFSWNVWPSSRLEATRMWPVA 
ALFTPLKERPDLPPIQYEPVLCSRTTCRAVLNPLCQVDYRAKLW 
ACNFCYQRNQFPPSYAGISELNQPAELLPQFSSIEYWLRGPQM 
PLIFLYWDTCMEDEDLQALKESMQMSLSLLPPTALVGLITFGR 
MVQVHEU5CEG I SKS YVFRGTKDLSAKQLQBMLGLSKVP VTQAT 
RG PQ VQQ PP P S NR FLQP VQK I DMNLTDLLGELQRDPW P V PQGKR 

PLRSSGVALSIAVGLLECTFPNTGARIMMFIGGPATQGPGMWG 
DELKTPIRSWHDIDKDNAKYVKKGTKHFEALANRAATTGHVIDI 
YACALDQTGIiLEriKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT 
KDMHGQFKMGFGGTLEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENEIGTGGTCQWKICX5LSPTTTLAIYFEWNQHNAPIPQGG\RG 
A \ I Q F VTQ Y \ QHS SGQRR I R VTT I ARN\ W ADAQTQI QN I AAS FD 
QEAAA I LMARLAI YRAETEEGPDVLR WLDRQLIRLCQKFGE YHK 
DDPSSFRFSETFSLYPQFMFHLRRSSFLQVFNNSPDESSYYRHH 
FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSSILADRIULM 
DTFFQ IL I YHGET I AQWRKSGYQDMPE YENFRHLLQAP VDDAQE 
ILHSRFPMPRYIDTEHGGSQARFLLSKVNPSQTHNNMYAWGQES 
GAP I LTDDVSLQVFMDHLKKLAVSS AA 


6104 


124 


732 


iwsju x a x usauai ur rtMJUMPllj v ijv v o Jr IVoAA*\v>V1iKN x WERLLR 
KLPQS R PG FPS PP WG PALAVQ \ AQ PCLQS QQM I P VEVKR I /RS L 
LDS I FWMAAPKNRRTIEVNRCRRRNPQKLI KVKNNI DVCPECGH 
LKQKKVLCAYCYE KVCKETAE IRRQ IGKQEGG PFKAPTIETWL 
YTGETPSEQDQGKRI IERDRKRPSWFTQN 


6105 


3 


983 


PLHGACTSLVLQRFCHRRPRPCAPARPEDMRRPAAVPLLLLLCF 
GSQRAKAATACGRPRMLNRMVGGQDTQEGEWPWQVSIQRNGSHF 
CGGSLI AEQWVLTAAHCFRNTS ETS LYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASSADVALVELEAPVPFTNYILPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSKDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQS WLQAG V I S WQEG CARQNRPG VY I RVTAHHNWI HR 1 1 P K 
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


G RP PTAP HTGR P PTAN RGDPRLDLKRG CAR LLTS I E SRGR PAAS 
AGLRR DR CALRRWPLRRAPLARATR RRAGS PRRCAPRPRACPQG 
! WSRARHCPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAICKGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMI FNGGAPNC 
I PCKETCENVDCG PG KKCRMNKKNK PRCVCA PDCSN I TWKGP VC 
GLDGKTVRHECALLKARCKEQPELBVQYQGRCKKTCRDVFCPGS 
STCV\ VDQTNNAYCVTCNR ICPE PAS SEQYLCGNDGVTYS \S AC 
HLR KATCLLG RS I GLA YEGKC I KAKS CE D I QCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKS DEP VCAS DNAT YASECAMKEAACSSGV 
LLEVKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


6107 


623 


16B 


srcssprpbpgrgrgk/lspsehrkwvevfkacdedhkgylsre" 

DFKTAVVMLFGYXPSKIEVDSVMSSINPNTSGILIjEGFLNIVRK 

kkeaqryrnevrhiftafdtyyrgfltledfkkafrqvapklpe 
rtvlevfre v \ drds \ dgh vs f 


6108 


3 


1348 


ggslrfspprvps.csrvfcpvppggcglpspmsasrpqspttpn 

CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFWMCCSMLVLL 
YYFYDLLVYWIGIFCLASATGLYSCLAPCVRRLP\SASAGESA : 
LLAPTIPNNSLPYFHKRPQARMLLLALFCVAVSWWGVFRNEDQ 
WAWVLQDALG IAFCLYMLXTI RLPTF KACTLLLLVLFL YD I FFV 
FITPFLTKSGSSIMVEVATGPSDSATREKLPMVLKVPRLNSSPL 
ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 
A YG VGLL VTFVALALMQRGQ PALL YLVPCTLVTS CAVAL W R REL 
GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATS P WPAEQS PKSRTSEEMGAGAPMREPGS PAES EGRDQAQPS 
PVTQPGASA 


6109 


1 


1381 


CRS RAGAASGG AI LEGTKLRRQR VDTNKP LD PL V P S ALRAAML Y 
LEDYLEM1EQLPMDLRDRFTEMREMDLQVQNAMDQLEQRVSEFF 
MNAKKNKPEWREEQMASI KKDY YKALEDADEKVQLANQ I YDLVD 
RHLRKLDQELAKFKMELEADNAGITE I LERRSLELDTPSQPVNN 
HHAHSHTPVEKRKYNPTSHHTTrDHIPEKKFKSEALLSTLTSDA 
SKENTLGCRNNNSTASSNNAYNVNSSQPLGSYNIGSLSSGTGAG 
GI \TMAAAQAVQATAQMKEGRRTSSLKAS YEAFKNNDFQLGKEF 
S MARE TVG YS SS S ALMTTLTQNAS S S AADS RSG R KS KNNNKS S S 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNBPRYCICNQVSYGEMVGCDTQDCPIEWFHYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


771 


2464 


ACP5AATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNGSGAFSQ 
ARSSSTGSSSSTGGGGQESQPSPLALLAATCSRIESPNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQ 1 1 SSSSGATPTS KEQSG 
SSTNGSNGSESSKNRTVSGGQYWAAAPNLQNQQVLTGLPGVMP 
NIQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQIQI I PGANQQ 
I ITNRGSGGNI IAAMPNLLOQAVPLQGLANNVLSGQTQWTNVP 
VALNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
bills SAS L VS SQ ASS S S F FTNANS YS TTTTTSNMG I MNFTTSG 
SS GTNS QGQTPQRVSGLQGS DALN I QQNQTSGGSLQAGQQKEGE 
Q\NQQTQAAPKSIuSRPQLVQGG\QALQ\AFQAAPLSGQTFTTQA 
I SQETLQNLQ LQAVPNS GP I I IRT PTVG PNGQ VS WQTLQLQNLQ 
VQNPOAQTITLAPMQGVSLGQTSSSNTTLTPIASAAS I PAGTVT 
VNAAQLSSMPGLQTINLSALGTSGIQVHPIQGLPLAIANAPGDH 
GAQLGLHGAGGDGIHDDTAGGEEGENSPDAQPQAGRRTRREACT 
CPYCKDSEGRGSGDPGKKKQHICHIQGCGKVYGKTSHLRAHLRW 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFMRSDHLSKHIKTHQNKKGGPGVALSVGTLPLDSGAGSEGSGT 
ATPS ALI TTNMVAMEA I CPEG I ARLANS G I NVKEGGQFCS P INT 
SANGF 

6111 




6112 


1*37 
77 


797 
196 


R VDPR VRGAMA P WG KRLAGVRG VLLDI SGVLYDSGAGGG TAI AG 
SVEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFDI SEQE 
VTAPAP AACQ I LKERGLR P YLL I H DG V\ AS EFDQ I DTS /S T PNC 

WIADAGESFSYQNKNNAFQVLMELEKPVLISLGKGRYYKETSG 
LMLDVGPYMKALE YACG I KABVGG KPS PEPFKS ALQ A IGVE AHQ 

AVMIGDDIVGDVGGAQRCGMRALQVRTGKFRPSDEHHPEVKADG 
YVDNLAEAVDLLLQHADK 


6113 
6114 


1779 


567 


MSSHKSFKSKRFLAKKOKPNRPILQWIWLKTGNKIRHNWK 
WEG RS W AACG VNLQGAWG ERSG VRAS E AES PG KJRADVS W WS RQL 
ETMVDHLANTE INSQRIAAVESCFGASGQPLALPGRVLLGEGVL 
TKECR KKAKPR I F FLFND I LV YGS I VLNKR K YRS QH 1 1 PLE EVT 
LELLPETLQAKNRWM1KTAKKSFWSAASATERQEWISHIEECV 
RRQLRATGRPA\STEHAAPWIPDKATDICMRCTQTRFSALTRRH 
HCR KCR VWCAECS RQR FL LPRLS P K P VRVCS LCYRELAAQQR K 
EEAE EQG AG VPRAASHLAR P I CGRPVEMTMTPTRTRRAAG^ATG 
PAAWSSTPRGWPGLPSTADPRPAEHLSPSQLHCPGPQEGSSRSC 

PGLRDPIPWWQVQRWGVALSGLPVPFCWTLCPYGFTAGNAFPFR 
KPQNTHRSW 


6115 


818 


245 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCX5LAAGSMCSCSPSWRCTPVPACWPSPPP\PAEQVQC 
GHLPPHADRRALRLPVAAPARG PG PGHPAGPAG PRPARTP PAS D 

HGPGRPTVPAPPCPLLAATEPTPSRPHQRWTRBDRMLGRGSQVT 
GRPQWFLRGLVLFSL 


6116 


324 


71 


DVCGRVCAHPHLYTHIHMHI CAHAC \ IHTHAQLC/ 1 TASHALAH 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 




595 


1430 


TWVMPPGRWHAA/ISSSGPVFEGARA\LQTVKKEEEDESYTPVQ 
AAR PQTLNR PGQEL FRQLFRQLRYHES SG PL ETLS R LRELCR WW 
LRPDVLSKAQILELLVLEQFLSILPGE1iRVWVQLHNPESGEE\L 
WPCWRSCRGTL«GHPGGTRALP\EPRCAIiDGYRS\LRSAQIWSL 
ASPLRSSSALGDHLEPPYEIEARDFLAGQSDTPAAQMPALFPRE 

gcpgdqvtptrsltaqlqetmtfkdvevtfsqdewgwldsaorn 
lyrdvmlenyrnmaslgk 


6117 
6118 


1433 


222 


vu VPS PAPPCS WBVGPGGG WTPG I LKEGQGGRRTPLLLLATRTR 

gllslfppaamhpaafplpvwaavlwgaaptrgliratsdhna 
smdfadlpalfgatlsqeglqgflveahpdnacspiappppapv 
ngsvfiallrrfdcnfdlkvlnaqkagygaawhnvnsnellnm 

VWNSEEIQQQrWIPSVFIGERSSEYLRAIiFVYEKGARVLLVPDN 
TF PLG YYL I P FTG I VGLL VLAMGA VMI ARC I QHRKRLQRNRLT K 

\eqlkqi\pthdyqkgdqydvcaicldeyedgdklrvlpcahay 

HSRCVDPWLTQTRKTCPICKQPVHRGPGDEDQEEETQGQEEGDB 

GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6119 


1044 


247 


STI3CRACTSGATPGAQSHRSARGHAAGGKETAALGMERGKVKK 
KEKEKETQKEKrGEKGREEKVKRKEVEOKTKOEmFifnPDtJirr'v 

EKEEECRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 
NKQILVLGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCIVTE 
DSQMEFLE IGGS KPFRS YWEMYLSN/ ADS LARS FS VGFKQDSQP 

ITWKAKKYLHQLIAANPVLPLWFANKQDLEAAYHITDIHEALA 




1217 


462 


DPRFVTENTTKAPAQERTTQPRSSREGTLRSTMEYLSALWPSDL 
LR S VSN IS S EFGRRVW TS AP PPQR P FR VCDHKRT I RKGLTAATR 

QELLAKALETLLLNGVLTLVLEEDGTAVDSBDFFQLLEDDTCLM 
VLQSG QS WS PTRSGVLS YG LGRER PKHS KDIAR FTFD V YKQNP R 
DLFGSLNVKATFYGLYSMSCDFQGL\GPKKVLRELLRWTSTLLQ 
3LGHMLLGI SSTLRHAVEGAEQWQQKGRLHS Y 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)
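The legend for the amino acid segment column can be applied mechanically when checking an entry. The following is a minimal Python sketch, not part of the original disclosure: the names `AMINO_ACIDS`, `MARKERS`, and `summarize_segment` are hypothetical, and treating lowercase letters as residues is an assumption made only to tolerate scanning noise.

```python
# One-letter codes exactly as listed in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue symbols defined by the legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Count residues and legend markers; collect characters the legend does not define."""
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            # Whitespace carries no meaning in the table.
            continue
        if ch.upper() in AMINO_ACIDS:
            # Assumption: a lowercase letter is the same residue as its uppercase form.
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            # Anything else (digits, punctuation, letters such as B) is reported, not guessed.
            unrecognized.append(ch)
    return {"residues": residues, "markers": markers, "unrecognized": unrecognized}
```

Calling `summarize_segment` on one row of the table returns the residue count, the number of each marker, and a list of characters that fall outside the legend, which is a quick way to spot transcription damage in a row.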


6120 


785 


179 


LERAGGGGLSSRALVGSGACLSLVARANGKGLPRGRKEFVBAVR
VRYVAFRYRTPRAVCLRLWSCRREVIMSGRGKQGGKVRAKAKSR
SSRAGLQFPVGRVHRLLRKGNYAERVGAGAPVYLAAVLEYLTAE
ILELAGNAARDNKKTRIIPRHLQLAIRNDEELNKLLGKVTIAQG
G\VLPNIQAVLLPKKTESQKDEGANDP 


6121 


1612 


107 


RGNGLRAVTPLRPGELLFRSDPLAYTVCKGSRGWCDRCLUSKE 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRECKCLKSCKPRYPPDS 
VRLLGRWFKLMDGAPSESEKLYSPYDLESNINKLTEDKKEGLR 
QLVMTFQHFMREEIQDASQLPPAFDLFEAFAKVICNSFTICNAE 
MQEVGVGLYPS ISLLNHSCDPNCS I VFNGPHLLLRAVRDIEVGE 
E LT I CYLDMLMTS E ERRKQLRDQ YCFECD \ CFRCQTQD KDADML 
TGDEQVWKEVQESLKKIEELKAHWKWEQVLAMCQAIISSNSERL 
PDINIYQLKVLDCAMDACINLGLLEEALFYGTRTMEPYRIFFPG 
SHPVRGVQVMKVGKIiQLHQGMFPQAMKNLRLAFDIMRVTHGREH 
SLIEDLILLLE/AMRRQHQSILRERSQREIRRVSLLNALLRSHT 
LCFVS CVNLS YWFCFCS VFV 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADSRKNNPSETSKPSMESGDG
NTGTQTNGLDFQKQPVPVGGAISTAQAQAFLGHLHOVQLAGTSL 
QAAAQSLNVQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQLMLA 
GGQITGLTLTPAQQQLLLQQAQAQAQliLAAAVQQHSASQQHSAA 
GATISASAATPMTQIPLSQPIQIAQDU3QLQQLQQQNLNLQQFV 
LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 
LLQSQPRI\TLTSQPATPTCTIAATPIQTLPQSQSTPKRIDTPS 
LEE P \ SDLE ELEQ FAKT FKQRR I KLG FT \QGD AGLAMVKL YGND 
FSPTTIFRFEALNLSPKNMCKLKPLLEKWLNDAENLSSDSSLSS 
PS ALNS PGI EGLS RRRKKRTS IEA\ N I RVALEKSFLEN\ QKPTS 
EEITMIADQLNMEKGVIRWFCNRRQKEKRINPPSSGG\TSSSP 
I KAI F PS PTSLVATTPSLVTSS AATTLTVSP VLPLTSAAVTNLS 
VTGTS DTTSNNTATVI STAP PASSAVTS PSLSPS PSASASTSEA 
SSAS ETS TTQTTS T PLS S P LGTS QVMVTASGLQTA/ AQLL PFKG 
AAQLPANASLAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNSTLATIQALASGGSLP I TS LDATGNL VFANAGGA 
PNI VTAPLFLNPQNLSLLTSN PVSLVS AAAASAGNSAPVAS LHA 
TSTSAES IQNS L FTVAS ASGAASTTTTAS KAQ 


6123


3 


2944 


H LLHRW FGTDMQM I NFTTG E FQ LTE AC P YLGTH S EESRFG 1 LHL 
HLQPLEMKRVGWFTPADYGKVTSLILIRNNLTVIDMIGVEGFG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDSKQILSIT 
KNFKVEN I G PL P I TVS S LK ING YNCQG YG FE V£DCHQFS LDPNT 
SRDISIVFTPDFTSSWVIRDLSLVTAADLEFRFTLNVTLPHHLL 
PLCADWPGPSWEESFMRLTVFFVSLSLLGVILIAFQQAQYILM 
BFMKTRQ RQNAS S S S QQNNGPMD VI S PHS YKSNC KNF LDT YG PS 
DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK 
HKTSTAAAS STS TTTEE KQTS PLGSSLPAAKEDI CTDAMRENWI 
SLRYASGINVNLQKNLTLPKNLLNKEENTLKNTIVFSNPSSECS 
MKEGIQTCM FPKETDI KTS ENTAEFKERELCPLKTS KKLPENHL 
PRNS PQYHQ PDL PE I S R KNNGNNQQ VP VKNEVDHCENLKKVDT K 
PSSEKKIH KTS RE DM FSEKQD I P FVEQED PYRKKKLQ E KREGNL 
QNLNWSKSRTCRKNKKRGVAPVSRPPEQSDLKLVCSDFERSEIiS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
VDAQHFLPAGDSVSQNDFPSEAPISI^LSHNICNPMTGNSLPQY 
AE PS CPSLPAGPTG VEEDKGLYS PGDLWPTP PVCVTSSLNCTLE 
NGVPCVIQESAPVHNSFIDWSATCEGQFSSAYCPLBLNDYNAFP 
EENMNYANGFPCPADVQTDFIDHNSQSTWNTPP\NMPAS\WGNA 
QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN 


6124 






PTTEHS D / THMEN Q A\ WCKE YY PG F \ N P FRA YMN LDI W TTT \ A 
NRNANFPLSRDSSYCGNV 


6125 


1573


236


SDEALRLAUKRGMGRVQLFEISLSHGRWYSPGEPLAGTVRVIU, 

GAPL P FRA I RVTC I GSCG VSNKANDTAWWEEGY FNS S LS LADK 

G S L PAGEHS FP FQ FLLPATAPTS FEG P FG KI VHQ VRAAI KT P R F 

SKDHKCSLVFYILS PLNLNS I PDIEQPNVASATKKFSYKLVKTG 

SWLTASTDLRGYWGQALQLHADVENQSGKDTSPWASLLQKV 

SYKAKRWIHDVRTIAEVEGAGVKAWRRAQWHEQILVPALPQSAL 

ruuauiniux i1»UVS1jKAPEATVTLPVFIGNIAV/NPCPSEPPA 

R PGAASWGPTPGG \ PSAP PQEEAEAEAAAGGPHFLDP VFLSTKS 

HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 

PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSOGGVE 
PSLTPES 




i 1 


904 


K.TCPKJjTCAFTVSVPDSCCRVCRGDGELSNEHSDGDIFRQPANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
x v vi v arwKH KHGUVCVSNGKT YSHGES WHPNLRAFG I VECVLC 
TCNVTKQBCKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL 
PGQSFDNKGYFCGEETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGILQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 
FTEGEAQI S QMCSS R VCRTEI*EDLVKVLYLERSEKGHC 


6126 
6127 


1224 


389 


RLLSEAPCpR5RRRFQMNPEWGQAF\mVAVAGGLCAVAVFTGiF 
DS VS VQVG YEH YAEAP VAGLPAFLAM P FNS L VNMA Y TLLGLS WL 

MK^^Ai^KiiXiPRYLKDVFAAmLLYGPVQWLRLWTQWRRAAVLDO 
WLTLP I FAW P VA WCLYLDRGWRP \ WLFLSLECVSLAS YGLALLH 
PQG FEVALGAHVVPAVGQAI*RT\HRHYG/S ATPSATYLALGVLS 
CLG FWL KLCDHQLARWRLFQCLTGHFWS K VCD VLQ FHFAFLFL 
THFNTHPRFHPSGGKTR 




1335 


463 


VLPRRCLVFWNTMDSSREPTLGRLDAAGFWQVWQRFDADEKGY 
IEEKELDAFFLHMLMKLGTDDTVMKANLHKVKQQFMTTQDASKD 
GRIRMKELAGMFLSEDENFLLLFRRENPLDSSVEFMQIWRKYDA 
DSSGFISAAKLRNFLRDLFLHHKKArSEAKLEEYTGTMMKIFDR 
NKDGRLDLNDLARILALQENFLLQFKMDACSTEKRKGDFEKIFA 
YYDVS KTGALEG P \ E VDG F VKDMMEL VQPS IS G VDLDKFRE I LL 
RHCDVNKDQKIQKSELALCLGLKINP


6128


2511 


843 


T^MSRRQLKRWVWSSG^VQARGRNVRAPRLGKIAMGLBMSSiO> 

SPGSLDGRAWRn&OFf DOC &MrYJr , nvTinin/Ti m 0 /, nn . tn«««^— 

w *r\j^jji^^Air(E,iyAy AWtAaGRKTRVYATS S RRAP PSEGTRR 
GGAARPEKTAEBGPPAAPGSLRHSGPLGPHACPTALPEPQVTSA 
MS SQ WG IEPLY I KAE PAS P D S P KGS S ETETEP P VALAPG \ PAP 

TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 
GYHYGVASCEACKAFFKRTIQGSIEYSCPASNECEITKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPIiPFPGP 
F PAG PLAVAGG PR KTAAPVNALVSHLLWE PE KLY AMPDPAG PD 
GHLPAVATLCDLFDREIWTISWAKSIPGFSSLSLSDQMSVLQS 
VWMEVLVLGVAQRSLTLQDELAFAEYLVLDEEGARPAGLGELG\ 
AAliLQL VRRLQALRLERE K Y VLLKALALANS DS VHI EDEPRL WS 

SCEKLLHEALLEYEAGRAGPGGGAERRRAGRLLLTLPLLRQTAG 
KVLAHFYGVKLEGKVPMHRXFLEMLEAMMD 


6129 
6130 


1764 
3 


771 

577


ARFARSAHEGKMPKKKTGARKKAENRREREKQLRASRSTIDLAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKLPICAQCGKtKCMM 
KSSDCVIKHAGVYSTGLAMVGAICDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQ VLEAET F KCVS CNRDGQHS CLRCKACFCDDHTRS KVFKQEKG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
srWKNLSSDXYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DLFTNLNLGRTYASGYAHYEEQEN 

3rggtmreykvwlgsgVgvgksaltv\qfvtctfiekydptie 








DFYRKEIEV\DSSPSVAGISWTQQGTEQF\ASMRDLYIKKGQGC 

ilvyslvnqqsfqNdikpmrdqiirvkvsekvpviXlvgnXsvd 

LESEREVSSSEGRALAEEWGCPFMETSAKSKTMVDELFAEIVRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 
PRSLSAMRLLPLAPGRLRRGSPRHLPSCSPALLLLVLGGCLGVF 
v» w rtrv^ a mv vuxju i uusjutu v iibutl l irJjKKTKALIGEMGMTFS 

SAYVPSALCCPSRASILTGKYPHNHHWNKTLEGNCSSKSWQKI 
QE PNT FP A I LRSMCG YQTF F \ AG KYLNE YGAPDAGGLEHVPLGW 
SYWYALEKNSKYYNYTLSIKGKARKHGENYSVDYLTDVLANVSL 
DFLDYKSWFEPFFMMTATP\APHSPWTAAPQYQKAFQNVFAPRN 
KNFN I HGTNKHWLI RQAKTPMTNS S I Q FLDNA FRKRWQTLLS VD 
DLVEKLVKRLEFTGELNNTYI FYTSDNGYHTGQFSLP IDKRQLY 
EFD I KVPLLVRGPG I KPNQTS KMLVAN I DLG PTIUDI AGYDLNK 
TQMDGMS LL P I LRGASNLTWRS DVLVEYQGEGRNVTDPTCPSLS 

PGVSOCFPDr*Vr*Fr>AYM?JTVai^\rD , PMCnT 1.TMT nvrmnnnAniiFi.. 

^ ** v^-* v v-DUrt i hh i x/iv. vK l TmoAIjWN JL»Q i CEFDDQEVFV 
BVYNLTADPDQITNIAKTIDPELLGKMNYRLMMLQSCSGPTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLL 


6132 


96 


1241 


AAGliLPPGLVPEDPRRTRNLLPFGIQGPPFALSRPbFSCVESGW 
AWEAMEPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQRLAALK1jEAEDIALTATSQKHKLTVVLEAVNRS\CSWRSGRP 
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVBVITI 

ESTKSGLKS e KLVEQLTE ystdkdep pkdvfdelfklape kvna 
vkeaivnfvnqkldrlglsvqnldtqfadgvilllligqlegff 
lhlkefyltpnspaemlhnvtiale'll/igrgpaqlpc/lalk/ 
tivnkdakstlrvlyglfckhtqkahrdrtphgapn 


6133 


2 


4256 



fvhgsmadtddfmeceeeelepwqkisdviedswedynsvdkt 

TTVSVSCKJPVSAPVPIAAHASVAGHLSTSrrVSSSGAONSDSTK 

ktlvtliannnagnplvqqggqpliltqnpapglgtmvtqpvlr 
pvqvmqnanhvtsspvasqpifittqgfpvrnvrpvqnamnqvg 
ivlnvqqgqtvrpitlvpapgtqfvkptvgvpqvfsqmtpvrpg 

STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 

TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFVT 

VKRPGVTGENSNEVAKLVNTLNTI PS LGQSPGPVWSNNSSAH\ 

GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL 

RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 

/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 

GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 

LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 

KI CEWAFES E PLFLQHMKDTH KPGEMP YVCQ VCQ YR S S LYS EVD 

VHFRMIHEDTRHLLCPYCI>KVFKNGNAFQQHYMRHQKR\NVYH\ 

CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 

SRGQPRTVP VS SNDT PPS ALQ EAAPLTS S MDP LP VFL YP P VQRS 

IQKRAVRKMS VMGRQTCLECS FE I PDFPNHFPTYVHCSLCR YST 

CCSRA YANHM I NNHVPRKS PKY LALFKNS VS G I KLACTS CTFVT 

SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 

NVKNMYPPPSFPTNKAATVKSAGATPAEPEEIiLTPLAPALPSPA 

STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE 

LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 

RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP 

VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 

VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 

DTE VLS S DDR KENALQTVGTGE P WCDVVLAI LADGTVLPTL VF Y 

RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 

RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAVVPAGCSSKIQPL 






WO 01/53312 



PCT/US00/34263











D VC I KRT V KN FLHK KW KEQAREMADTACDSD VLLQLVLVWLGE V 
1X3 V I G DC P EL VQRS FL VAS VLPGP DGN INS PTRN ADMQ E ELI AS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6134


2 


4256


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAriASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPLILTQNPAPGLGTMVTQPVLR 
P VQVMQN ANHVTSS PVAS OP I F ITTQG FPVRN VRP VQNAMNQ VG 
IVLNVQGCTTVRPITLVPAPGTOFVKPTVGVPQVFSQMTPVRPG 
STM PVR PTTNT FTTV I P ATLTI RS T VPQSQSQQTKS TPS TSTTP 
TATQPTSIiGQLAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\ 
GS QRT SG P ES S M KVTS S I P V FDLQDGGRKI CPR CN AQFRVTE AL 
RGHMCYCCPEMVEYQKKGKSLDSBPSVPSAAKPPSPEKTAPVAS 
/THPS3TPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVA0LTNFPKVATSFRCPHCTKRLKNNIRFMNHMKH1JVE 
LDQQNGEVDGHTI CQHCY RQ FSTPFQLQCHLENVHS PYBSTTKC 
KICE W AFES E PL FLQHMXDTH KPGEM P YVCQVCQ YRSS L YS E VD 
VHFR M I HEDTRHLLCP YCLXVFKNGNAFQQHYMRHQKR \NVYH\ 
CNKCR VQ FL FAKDKI EHKLQHHKTFRKP KQLEGLKPGTKVT I RA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMS VMGRQTCLECS FE IPDFPNHFPTYVHCSLCRYST 
CCS RAYANHM INNHVPR KS P KYLALF KNS VSG I KLACTSCT F VT 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 

STATPPPTPTHPOAT.IXTiDDT aTlTPSCPT mmnnnrn n nt rmnn nn 
* r i tr mtr\jnunutr 1 ooAliLLiNVUDQDEGSPVTQE PS 

LASGGGGSGGVGKKEQLSVKKLRWLFALCCKTEQAAEHFRNPQ 
RR IRRWLRRFQAS QGENLEGKYLS FEAE EKLAEWVLTQ REQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLS S DDRKENALQTVGTGEP WCDWLAILADGTVLPTLVFY 
RGQMDQ PANMPDS I LLEAKESG Y S DDE I MELWSTRVWQ KHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
DVC I KRTVKNFLH KKW KEQ AREMADTACDS DVLLQLVLVW LG E V 
LG VI GDC P ELVQRS FLVAS VL PG PDGN I NS PTRNADMQEE LI AS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6135 


2 


4256 


FVHGSMADTDLFMECEEEELEPV/QKISDVIEDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPLI LTQNPAPGLGTMVTQPVLR 
P VQVMQNANHVTSSPVASQP I FI TTQG FPVRNVRPVQNAMNQVG 
I VLNVQQGQTVRP I TLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS I AS FVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVVVSNNSSAH\ 
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSSTP I PALS PPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNF PKVAT S FRC PHCTKRLKNN I RFMNHMKHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLEIIVHSPYESrrKC 
KI CE WAFE S E PLFLQHMKDTHKPO EMP YVCQVCQ YRSS L YS E VD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
CCSRAYANHMIKNHVPRKSPKYLALFKNSVSG I KLACTSCT FVT 








S VGDAMAKHLVFNPSHRSSS I LPRGLTWI AHSRHGQTRDRVHDR 
NV KNMY P P P3 FPTNKAAXVKS AGATPAE PEE LLT P LAPAL PS PA 
STATPPPTPTHPOALALPPIATEGAECLNVDDQDEGSPVTOEPE 
IiASGGGGSGG VG KKEQLS V K KLRWLFALCCNTEQAAEH FRJf PQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHI,TPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVLSSDDRKE^ALQTVGTGEPWCDVVIAIIADGTVLPTijVFY 
RGQMDQPAKMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL 
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 

LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGQVAS
SLFRGEHHSRGGTORLASLFSSI.EPQIQPVYVPVPKNESAIASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKILDDTEDTWSQRKKIQ 
INQEEERLKNERTVFVGNLPVTCNKKKLKSFFKEYGQIESVRFR 
SLIPAEGTLSKKLAAIKRKIHPDQKNINAYWFKEESAATQALK 
RNGAQIADGFRIRVDLASETSSRDKRSVFVGNLPYKVEESAIEK
HFLDCGSIMAVRIVRDKMTGIGKGFGYVLPBNTDSVHLALKLNN 
SELMGRKLRVMRSVNKEKFKQQNSNPRLKKVSKPKQGLNFTSKT 
AEGHPKSLFIGEKAVLLKTKKKGQKXSGRPKKQRKQK 


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG

MWNML I VAMCLA\ LLGLPGKAQELQGHVS \ I I LAGEQLGDLAKK 

YLWQG \LFQLYLDEAGRGHS FSFHGAALTAPKQGQELMAKALES 

LSCPKDMAPSHCAEHKDQFLQLSQYRQLKTAEDYQALNKDIEAQ 

LQHAGLREAGGIFYFSVPPFAYEDIARNINSSCRPGPGAWLRW 

LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQIL 

PFRDQNRKALDGLWNRHHVERVEIIMkETVDAEGRTSFYEEYGV 

I RD VLQNHLTBVLTLVAME L PHMVSSAEAVLRHKLQVFQALRGL 

ORGSAWGQYQS YS EQVRRELQKPDSFHSLTPTFAGVbVH IDNL 

RWEGVPFILKSGKALDERVGYARILFKNQACCVQSEKHWAAAQS 

QCLPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPG 

LRLFGSPLSDYYAYSPVRERDAHSVLLSHIFHGRKNFFITTENL 

LASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 

QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEEL1SKL 

ANDIEATAVRAVRRFGQFtrLALSGGSSPVALFQQLATAHYGFPH 

AHTHLWLVD ERC VPL3 DP E SNFQGIXJAHLLQHVR I P Y YNI H \ AM 

PVHLQQRLCAEEDQGAHI YARE I S ALGANS S FDLVLLGMGADGH 

TASLFPQSPTGLDGEQLWLTTSPSQPHRRMSLSLPLINRAKKV 

AVLVMGRMKREITTLVSRVGHEPKKWP1SGVLPHSGQLVWYMDY 
DAFLG 


6138 


4587


934 


Kh'SKJjTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENliFRFL 
TDTSHLLSAVKGQERFSLYQTRSLIHELKWKEIHFQRRRTTCAL 
TLEAGEKLLLTTDLKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 
FQSTVETWDQCEKKIKELKSRLQVbKAQSEDPLPELHEDLHNEK 
ELI KELEQSLASWTQNLKELQTMKADLTRHVLVEDVMVLKEQI E 
HLHRQWEDLCIaRVA I R KQEIEDRLNTWWFWEKNKELCAWLVQM 
ENKVLQTADISIEEMIEKLQKDCMEEINLFSENKLQLKQMGDQL 
IKASNKSRAAEIDDKLNKINDRWQHLFDVIGSRVKKLKETFAFI 
QQLDKNMSNLRTWLARIESELSKPWYDVCDDQEIQFCRLAEOQD 
LQRDIEQHSAGVESVFNrCDVLLHDSDACANETECDSIQQTTRS 
LDRRWRNICAMSMERRMKIEETWRLWQKFLDDYSRFEDWLKSAB 
RTAACPNS S EVLYTSAKEELKR FE AFQRQI HER LTQLELI NKQY 
RRLARENRTDTASRLKQMVHEGNQRWDNLQRRVTAVLRRLRHFT 








NQREEPEGTRBS I LVWLTEMDLQLTNVEHFS ES DADDKMRQLNG 
FQQEITLNTNKIDQLXVFGEQLIQKSEP\LDAVLIEDBLEELHR 
YCQEVFG RVSRFHRRLTSCTPGLEDE KEASENETDM EDPRE I QT 
DSWRKRGESEEPSS PQSLCHLVAPGKERSGCETPVS VDS \ I PLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLLPPGTDGG 
KEGPRVLNGNPQQEDGGLAGITEQQSGAFDRWEMIQAQEL\HNK 
LKIKQNLQQLNSDISAITTWLKKTEABLEMLKMAKPPSDIOEIE 
LRVKRLQE I LKAFDT YKAL WS VNVSS KE FLQTES PES TELQSR 
LRQ LS LLWEAAQG AVD S WRGGLRQS LMQ CQD FHQLSQNbLLWIiA 
SAKNRRQKAHVTDPKADPRALLECRRELMQLEKELVERQPQVDM 
LQEISNSLLI KGHGEDC I EAE E KVHVI \ E KKLKQLREQVSQDLM 
ALQGTQNPASPLPSFDEVDSGDQPPATSVPAPRAKQFRAVRTTE 
G EE ETES RVPGSTR PQRS FLS R WRAALPLQ LLLLLLLLLACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139 


52 


1131 


LGDWVWSRTCGVLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TERRPWEASSWKTL/LAGWIGGAASVIVGHPLDTVKTRLQAGVG 
YGNTLSCIRWYRRESMFGFFKGMSFPLASIAVYNSWFGVFSN 
TQRFIiSQHRCXSEPEAS PPRTLSDLLLASMVAG WSVGLGG P VDL 
IKIRLQMQTPPVSGRQPRFEVQGSGSCG\EPAYQGPVHCITTIV 
RNEGLAGLYRGASAMLLRDVPG YCLYFI P YVFLSEWI TPEACTG 
PS PCAVWLAGGMAGAI S WGTATPMDWKSRLQADGVYLNKYKGV 
LDCISQSYQKEGLKVFFRGITVNAVRGFPMSAAMFLGYELSLQA 
IRGDHAVTSP 


6140 


694 


136 


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVRAQPLSVTVWAP 
RCQR P/Q PPAPEPSSPNAAVPEAI PTPRAAASAALELPLGPAPV 
S VAPQAEAEARSTPG PAGS RLG PET FRQR FRQFR YQDAAGPREA 
FRQLREL / S PRQWLR PD I \ RTKEQ\ I VEMLVQEQLLAI LP EAAR 
ARRIRRRTDVRITG 


6141 


2 


984 


AQVG PRSRPCKM PL KLRGKKKAKS KETAGLVEGE PTGAGGGS bS 
ASRAPARRLVFHAQLAHGSATGR VEG FSS I QEL YAQIAGA FE IS 
PSEILYCTLNTPKIDMERLLGGQLGLEDFIFAHVKGIEKEVNVY 
KS E D S LGLT I TDNG VG YAF I KR I KDGG VTDS VKTI CVG DH I E S I 
NGEN I VGWRHYDVAKKLKE LKKE ELFTMKLI E P KKAFE I E LRS K 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLELYMGIRDIDLATTMFEAGKDKVNPDEFAVALDETLGDFAFP 
DEF V FDVWG VI GDAKRRGL 


6142 


116 


602 


EAEGEQVCXjAKCCGDAPHVENREEETAR IGPG VMES KEERALNN 
LIVENVNQENDEKDEKEQVANKGEPLALPLNVSEYCVPRGNRRR 
FRVRQPILQYRWDIMHRLGEPQARMREENMERIGEEVRQLMEKL 
REKQLSHSLRAVSTDPPHHDHHDEFC\LMP 


6143 


2802 


270 


FRMRI FLHCPWNQQMWKIWNLLETSLESCKAHLS IQKLLKER\q 
\QLPVFKHRDSIVETLKRHRWWAGET\GSGKSTQVPHFLLED 
LLLNEWEASKCNIVCTQPRRISAVSLANRVCDELGCENGPGGRN 

FIVDEV\HER\SVQSDFLLIILKEILQKRSDLHLILMSATVDSE 
KFSTYFTHCPILRISGRSYPVEVFHLEDIIEETGFVLEKDSEYC 
Q KFLE E E S EVT I NVTS KAGG I KK YQE Y I P VQTGAHADLNP F YQK 
YSSRTQHAILYMNPHKINLDLILELLAYLDKSPQFRUIEGAVLI 
FLPGLAHIQQLYDLLSNDRRFYSBRYKVIAliHSILSTQDQAAAF 
TLPPPGVRKIVLATNIAETGITIPDWFVIDTGRTKENKYHESS 
QMSSLVETFVSKASALQRQGRAGRVRDGFCFRMYTRERFEGFMD 
YSVPEILRVPLEELCLHIMKCNLGSPEDFLSKALDPPQLQVISN 
AMNLLRKIGACELNEPKLTPLGQHLAALPVNVKIGKMLI FGAI F 
GCLDPVATLAAVMTEKSPFTTPIGRKDEADLAKSALAMADSDHL 
TIYNAYLGWKKARQEGGYRSEITYCRRNFLNRTSLLTLEDVKQE 








l i klvkaagfsssttsts wegnrasqtls fqei allkavlvagl 
vdnvgki i ytks vdvts klac i vstaqg kaqvh p s s vnrdlqth 
gwllyqeki r yarvylrbttl i tpfpvllfggdi evqhrerlls 
idgwiyfqapvkiavifkqlrvlidsvlrkkLenpkmslendki 

LQIITBIiIKTENN 


6144 


1289 


568 


sgpgsmsgqrvdvkvvmlgkeyvgktslveryvhdrflvgpyqn 
vsasggarhggrgsggpvictygpdlfpi,va\tigaafvakvms 
vgdrtvtlgiwdtagseryeamsriyyrgakaaivcydltdsss 
ferakpwvkelrsleegcqiylcgtksdlleedrrrrrvdfhdv 
qdyadnikaqlfetssktgqsvdelfqkvaedyvsvaafqvmtb 

DKGVDLGQKPNPYFYS CCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRbSSPVPAVCRKEPCVLGVDEAGRGPVIi 
GPMVYAICYCPLPRIiADLEAIjKVADSKTLLESERERLFAKMEDT 

dfvgwaldvlspnlistsmlgrvkynlnslshdtatgliqyald 
qgvnvtqvfvdtvgm pe tyqar lqqsfpgi evtvkakadalypv 

\VSAASICAKVARDG^VKKWQFVEKLQDLDTDYG\SGYPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
ED5AS ENQEGLRKI TS YFLNEGSQARPRSSHRYFLERGLES TTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVIjTDEQKS 
R / YPGHEAHDQGG \ WDARQS I IRKWDPETGRTRL I KGDGEVLE 
E I VTKERHRE INKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
ETDMVRQIRALDSDMQTLVYENYNKFISATDTIRKMKNDFRKME 
DEMDRLATNMAVITDFSARISATLQDRHERITKLAGVHALLRKL 
Q FL FE LPS RLTKC VELGAYGQAVR YQGRAQAVLOQ YQHLPS FRA 
IQDDCQVITARliAQQLRQRFREGGSGAPEQAECVELLLALGEPA 
EELCEEFLAHARGRLEKELRNLEAELGPSPPAPDVLEFTDHG\S 
SGFVGGLCQVAAAYQELFAAQGPAGAEKLAAFARQLGSRYFALV 
ERRLAQEQGGGDNSLLVRALDRFHRRLRAPGALLAAAGLADAAT 
EIVERVARERIjGHHLQGLRAAFLGCLTDVRQALAAPRVAG keg p 
GLAELLANVASSILSHIKASLAAVHLFTAKEVSFSNKPYFRGEF 
CSQGVREGLI VGFVHSMCQTAQSFCDS PGEKGGATP PALLLLLS 
RLCLDYETATISYILTLTDEQFLVQDQFPVTPVSTLCAEARETA 
RRLLTHYVKVQGLVISQMLRKSVETRDWLSTLBPRNVRAVMKRV 
VEDTTAIDVQVLPRLAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMCIWASHGASSVARASVREPQGNKSPRMNTKRAGECLCPRS 
CSFSAQDYDIFAPILPVEKQRLRVTQEVRAGLVLVLKIRPQTNS 
CILPLPHSTGSINSDHVPTK 


6148 


305* 


353 


VPAVGGTFADGAMGEAEKFHYIYSCDLDINVQLKIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPLALPVRT 
SYKAFSTRWNWNEWLKLPVKYPDLPRNAQVALTIWDVYGPGKAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPNCRSQMDQKPTKTPGRT 

S STIjS E DOM I JV If T ,T V RH R Cid UM\TKVT>W T .flR T ,TI? D P T PM T NT7 Q 

VKRSSNFMYLMGGFRCVKCDDKEYGIVYYEKDGDESSPILTSFE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
L KN I VS Y PPS KP PT YEEQDLVWE FRYYLTNQDKALT KI LTS V I W 
DLPQGAKQALALLGKWKPMDVEDSLELLSSHYTNPTVRRYAVAR 
LRQADDEDLLMYLLQLVQALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQI1T/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLCrFLISRASKNSTLANYLYWYVIVECEDQDTQQRDPK 
THE^LNVMRRFSQALLKGDKSVRVmSLI^QQTFVDRLVHLM 
KAVQRESGNRKKKNERLQALLGDNEKKNLSDVELIPLPLEPQVK 
IRGIIPETATLFKSALMPAQLFFKTEDGGKYPVIFKHGDDLRQD 
QLI LQ 1 1 S LMDKLLRKENLDLKLTP YKVLATSTKHGFMQF IQS V 








PVAEVLDTEGSIQNFFRKYAPSENGPNGISAEVNDTYVXSCAGY
CVITY I LG VGDRHLDNLLLTKTG KLFH I D FG YI LGRD PKPL P P P 
M KLNKEM VEGMGGTQS EQ YQE FRKQC YTA FLHLRR YS NL I LNLF 
S LMVD AN I PDI ALE PD KTVKKVQDKFRLDLSDEE AVH YMQSL I D 
ES VHALFAAWEQ I HKFAQ YWR K 


6149


1


1413 


RVDPRVRENGTANPI KNGKTSPASKDQRTGKKTSVQGQVQKGNdH 

ESESDFESDPPSPKSSEEEEQDDEEVLQGEQGDFNDDDTEPENL 
GHRPLIxMDSEDEEEEEICH <: ; c ?r) < ;r)YPnaifax , vcnwcc!xrvDrir>c»r»n 1 

GPTQDLNTILLTSAOLSSDVAVETPKQEFDVFGAVPFFAVRAQQ 
PQQEKNEKNLPQHRFPAAGLEQEEFDVFTKAPFSFCKVNVQECHA 
VG PE AHT I PG Y PKS VDVPGST P FQP FLTSTS KS ESNE DL FG LVP 
FDEITGSQQQXVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 
STKKTLKPTYRTPERARRHKKVGRRDSQSSNEFLTISDSKENIS 
VALTDGKDRGNVLQPEESLLDPFGAKPFHSPD\LSWHPP\HQGL 
S\DIRADHNT\VLPGR\PRQNSLHGSFHSADVLKMDDFGAVP /F 
LTELWOS ITPHOSOOSOPV\ ET.nDPrcaADPDG vr» 1 


6150 


372 


37 


MSNIKKYI IDYDWKAS I E I DHDVMT*EEKLHQINNFWSDSEYR 
LN KHG S VLNAVLIMLAQHAI*L 1 A I S S DLNAYG WCE FD WNDGNG 
QEGWPPMDGSEGIRITDIDTSGIF


6151 


1555 


521 


dsnqqsvsgtaastllhsfkatiyyo/stghvqqfygvtspysqtH 

TP P I VQS YAQPSLQYIQGQQI FTAHPQG VWQPAAAVTT I VA PG 

OPOPLOPSEMWTTJNT.T.nT.PDDQDDlTDVT'TiJT r>nkTT.irrrn»-r»T\T»T^^» 1 
^* "v rJt " «▼ v j.v*v*uLiUUc ifK isr'A'jvr'Al J VIjfPNWKTARDPEG I 

KI YYYHV I TRQTQWDP PT W ES PGDD ASLEHE AE MDLGTPT YDEN 

PMK\ASKKPKTAEADTSSELAKKSKEVFRXEMSQFIVQCLNPYR 

KPDCKVG \ R I TTTEDFKHLARKLTHG VMNKEL KYCKN PE \ DL EC 

NENVKHKTKEY I KKYMQKFGAVYKPKEDTEFRVTVGPGWEDG WS 

GKTDSRERKSCGPFCSTPVSTVLLMIHHPGBFNPADVN


6152 


1366 


648 


nrtwstpstwmgvalpplcstgpwpvtrqitartttoavpMcpH 

PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 
GPSNCSQHGLCTETGCRCDAGWTGSNCSEECPLGWHGPGCQRPC 
KCEHHCPCDPKTGNCSVSRVKQCLQPPEATLRAGELSFFTRTAW 
LALTLALAFLLLISTAANLSLLLSRAERNRRLHGDYAYHPLQEM 
NGEPLAAEKEQPGGAHNPFKD


6153 


2 


3368 


grvgarspgrayalllllicf^gsglhlqvlstrnenkllpkhH 

PHLVRQKRAWITAPVALLEGEDLSKKNPIAKIHSDLAEERGLKI 

tykytgkgiteppfgifvfnkdtgelnvtsildreetpfflltg 
yaldargnnvekplelrikvldindnepvftqdvfvgsveelsa 
ahtlvmkinatdadepntlnskisyrivslepayppvfylnkdt 
geiyttsvtldreehssytltveardgngevtdkpvkqaqvqir 
ildvndnipwenkvlegmveenqvnvevtrikvfdadeigsdn 

WLANFTFASCTJEGGYFHIETDAQTNEGIVTIiIKEVDYEEMKNLD 

fsvivankaafhksirskykptpipikvkvknvkegihfkssvi 
siyvsesmdrsskgqiignfqafdedtglpaharyvkledrdnw 
isvdsvtseiklaklpdfesryvqngtytvkivaisedyprkti 
tgtvlinvedindncptliepvqtichdaeyvnvtaedldghpn 
sgpfsfsvidkppgmaekwkiarqestsvllqqsekklgrseiq 
flisdnqgfscpekqvltltvcevlhgs \gcreaqhds yvglgp 
aaialmilaflllllvpllllmchcgkgakgftpipgtiemlhp 

WNNEGAPPEDKWPSFIjPVDQGGSLVGRNGVGGMAKEATMKGSS 

sasivkgqhemsemdgrweehrsllsgratqftgatgai \mtte 
tt i taratg as rd vag aq aaav aln ee flknyftd kaas yteed 
enhtakdcllvysqeeteslnasigccsfiegelddrflddlgl 
kfktlaevclgqkidinkeieqrqkpatetsmntashslceqtm 
vnsentyssgssfpvpkslqeanaeicvtqeivtersvssrqaqk 
vatplpdpmasrnviatetsyvtgstmppttvilgpsqpqsliv 
tervyapastlvdqpyanegtvwterviqphgggsnplegtqh I 








WDVPYVMVRERSSFLAPSSGVQPTIiAMPNIAVGQNVTVTSRVI, 
A PASTLQS S YQ I PTENS MTARNTTVS GAGVPG PL PD FGL E SSGH 
SNSTITTSSTRVTKHSTVQHSYS 


6154 


3660 


214* 


KKKTKMKN7LQKTVNFGAWPKPTI SDKSHLLQMVSKLDIiTDAKN 
SDTAH I KS I B I TS I LNG LQASES S AE DSEQBDERGAQDMDNNG K 
EESKIDHLTNNRNDLISKEBQNSSSLLEENKVHADLVISKPVSK 
SPERLRKD3EVLSEDTDYEEDEVTKKRKDVKKDTTDKSSKPQIK 
RGKRR YCNTBECLKTGS PGKKEEKAKNKESLCMENSSNSS SDED 
EEETKAKMTPTKKYNGLEEKRK<?l.T?TTT:PVQr;i; , G». , v/?vt7irTj tvtt 

NNS DERLQNSRAKDRKDVWS S IQGQW P KKTLKELFSDS DTEAAA 
SPPHPAPEEGVAEESLQTVAEEESCSPSVELEKPPPVNVDSKPI 
EEKTVEVNDRKAEFPSSGSNFSA* IPLPYLHLNRLHQSL *QKGS 
RQQSSVTVSEPLAPNQEEVRSIKSETDSTIEVDSVAGBLQDLQS 
ERE* LASRF*CQCELEQ* *SARTRTS * KSLYRSEKSERCSGRRK 
FIKKAEFO<P*SNSGKQQKEGK 


6155 


669 


121 


HLLPELRGKS WITMKY VF YLGVLAGTFFFADS S VQKEDPA P YGV 
YLKSH FNPCVGVLI KPSWVLAPAHCYLPNLKVMLGNFKSRVRDG 
TEQT IN P IQI VRY WNYS HS APQDDLML I KLAK PAMLNPK VQALN 
P\ PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNLEAPVMS DRE 
CQKTEQGKSHRNS LCVKFVKVFSR I FGEVAVATVI CKDKLQG IE 
VGHFMGGDVG I YTNVY KYVS W IENTAKDK 


6156 


5725 


3984 


GTST VTMATKKHFS I ILNLLGMLLKKDNQDTRKLLMTWAIiEVAV 
vrcruvofci XHVlit uLiFdrrlKr UKGLIiADTLVEDyNICLQACSSLH 
ALSSSLPDDLLQRCVDVCRVQLVHRGTCIRQAFGKLLKSIPLGV \ 
FLSNNNHTEIQEISLALRSHMSKAPSNTFHPQDFSD/VISFILY
GNSHRTGkmWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW
AI WE AAQ FTVLS KLRTP LGRAQDTFQT I EG 1 1 RS LAGHTLN P DQ

ALTSP PKVIRTFLYTNRQ7CQDWLTRI RLS IMRVGLLAGQPAVT I 
VRHGFDLLTEMKTTSLSQGNELEVS IMMWEALCEbHCPEAIQG 
I AVWS SS I VGKHLLWINSVAQQAEGRFEKASVEYQEHLCAMTGV 
DCC I S S FDKS VLTLAS AGCKS ASLKHC LNGES R KS VI*S KPTDS S 
PEVINYLGNKACECYISTADWAAVQEWQNAIHDLKKSTSSTSLN 
LKADFNYIKSLSSFESGKFVECTEQLELLPGENINLLAGGSKEK 
IDMKKLLRNM 


6157 


946


329 


MANRGPS YGLSRE VOEKI EOKYDADIjFNKr iVnw T T i/irn v n t t?h 
PPPGRAHFQKWLMDGTVLCKliINSLYPPGQEPIPKISESKMAFK 
OMEQI SQFLKAAETYGVRTTD I FQTVDLWEGKDMAAVQRTLMAL 
G S VAVTKDDGCYRGE PSWFHRKAQQNRRG FS EEQLRQGQNVI G L 
QMGSNKGASQAGMTG YGMPRQ I M * DAASCP 


6158 


441 


1482 


LGSLI VLSLHCKVI FSSQSLERAMKEKAVDLVP ILAQN PGLAQN 
P ILEG KDHNQNTGVD PI I DHVQDRKTD/S RSKS PHKKRS KSRER 
RKSRSRSHS RDKRKDTREKI KEKERVKEKDREKERE REKEREKE 
KERGKNKDRDKBREKDREKDKEKDREREREKEHEKDRDKEKEKE 
QDKE KEREKDRS KE I DEKRKKDKKSRTPPRS YNAS RRSRSSSRE 
RRRRRSRSSSRSPRTSKTIKRKSSRSPSPRSRNKKDKKREKERD 
HISERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
S S VS KE VDDKDAP RTE EN KI QHNGNCQ LNEENLSTKTEAV 


6159 


53


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPIjMIWPYTLPVSIjPVGSCV 
1 1 TGT P I I/TFVKD PQLE VNF YTGMDBDS D I AFQ FRLH FGH P AIM 
NSCVFGIWRYEEKCYYLPFEDGKPFELCIYVRHKEYKVMVNGQR 
I YNFAHRF PPAS VKMLQVFRDI S LTRVL I SD * GRC VR I TAVQE F 
DVSVS CDCTTAYQPG 


6160 


1626 


1790 


agakffp*f*kvadaqptesekeiynqvnwlkdaegiledlqs 
yrgagheireaiqhpadeklqekawgawplvgklkkfyefsqr 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
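The legend in the column heading can be read as a small lookup table. The following is a minimal sketch, not part of the original specification, of how a segment from this table might be decoded. The names `LEGEND`, `normalize`, `describe`, and `split_at_stops` are hypothetical, and the sketch assumes that whitespace inside a segment is layout only and carries no meaning.

```python
# One-letter codes and special marks as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def normalize(segment):
    """Drop whitespace (assumed to be layout only) and upper-case the codes."""
    return "".join(segment.split()).upper()


def describe(segment):
    """Return the legend entry for each symbol; reject symbols outside the legend."""
    names = []
    for symbol in normalize(segment):
        if symbol not in LEGEND:
            raise ValueError(f"symbol {symbol!r} is not in the legend")
        names.append(LEGEND[symbol])
    return names


def split_at_stops(segment):
    """Split a segment into the stretches lying between '*' (stop codon) marks."""
    return normalize(segment).split("*")
```

For example, under these assumptions `split_at_stops("LDHIRERSQGPSLGP*H*WFSKKA")` returns `["LDHIRERSQGPSLGP", "H", "WFSKKA"]`, and `describe` raises an error for any character that the legend does not define, which can help flag lines damaged in reproduction.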








LEAALRGLIiOALTSTPYSPTQHLEI^QALAKQFAEILHFTLRFD 
ELKMTNPAIQNDFSYYRRTLSRMRINNVPAEGENEVNNELANRM 
S LFYAEATPMLKTLS DATTKFVSENKNLP I ENTTDCLSTMAS VC 
RVMLETPBYRSRFTNEETVS FCLRVMVGVI I LYDKVHPVGAFAK 
TSKIDMKGCIKVLKDQPPNSVEGLLNALRYTTKHLNDETTSKQI 
KSMLQ*QLLTLVNKG 


6161 




1569 


PVSGSES SLRRAWAS I LRLMLG PRVAVS I LCEDGISH* LLEKH* 
KSHVLEPLSSLALEEQCtiALSLDWSTGKTGRAGDQPLKI I SSDS 
TGQLHLLMVNETRPRLQKVAS WQAHQFEAWI AAFN Y WH PE I VYS 
GGDDGLLRGWDTR V PG KFLFTS KRHTMG VCS I QS S PHREH ILAT 
GSYDEHILLWDTRNMKQPLADTPVQGGVWRIKWHPFHHHLLXiAA 
CMHSG FK I LNCQKAME ERQE ATVLTS HTL PDS LVYG AD WS WLLF 
RSLQRAPS WS FPSNLGTKTADLKGAS ELPTPCHECREDNDGEGH 
ARPQSGMKPLTEGMRKNGTWLQATAATTRDCGVNPEEADSAFSL 
LATCS FYDHALHLWEWEGN


6162


1


586 


RTIHATGRAGASPMHRLIVWRLAEANKQHVRCQKCLEFGHWTYE 
CTGKRKYLHRPSRTAELKKALKE KENRLLLQQS IGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKE I ELLHS YWTDGLKTLM 


6163 


1081 


785 


R I RSTTEGCAVRLH PTQNTG KARIMI LLSVS IiGRHWAFTYKFFL 
TPWFVFFFPPFHRKE * VMQKNPMKS REDE WMEKLNNLHVQRAD 
MNRLI MNYLVTEG FKE AAE KFRME SG I EPSVDLETLDER I K I R E 
M ILKGQ X QEA I AL I NS LHP ELLDTNR Y LYFHLQQQHLI ELI RQR 
ETBAALEFAQTQLAEQGEESRECLTEMERTLALLAFDS PEES PF 
GDLLHTMQRQKWSEVNQAVLDYENRESTPKLAKLLKLLLWAQN 
ELDQKKVKYPKMTDLSKGVIEEPK 


6164 


90 


406


PCQS PGRS RMRQD KLTG S LRRGGRCL KRQGGGVGT I LSNVLKKR 
SC I S RT A PRLL CTLEPG VDTKLKFTLEP SLGQNG FQQ WYDALKA 
VARLSTG I PKEWRRKVWLTLADHYLHS I AIDWDKTMRFT FNERS 
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRWLKRVLLAYAR 
WmCTVGYCCXSFNILAALILBVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
G YEP PLTNVFTMQWFLTL FATCLPNQTVLKI WDS VF FEGS E 1 1 L 
RVSLA I WAKLGEQ I E CCETADE F YSTMGRLTQEMLENDLLQS HE 
LMQTVYS MAPFPFPQLAELREK YTYN I TPFPATVKPTS VS GRHS 
KARDSDEENDPDDEDAWNAVGCLGPFSGFLAPELQKYQKQIKE 
PNEEQSLRSNNIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQ YSRI KKKQQQQVHOVY IRADKGP VTS I LPSQVNSSPVIN 
HLLLGKKMKMTNRAAiCNAVIHI PGHTGGKI S PVPYEDLXTKLNS 
PWRTHI RVHKKNMPRTKSHPGCGDTVGL I DEQNE AS KTNGLGAA 
EAFPSGCTATAGREGS S PEGSTRRTI EGQS P E PVFGDAD VDVS A 
VQAKLGALELNQRDAAAETELRVHPP CQRHCPEP PS APEENKAT 
S KAPQGSNS KTP I FS P FPS VKPLRKS ATARNLGLYG P TE RTPTV 
H F PQMS RS FS KPGGGN S GP * KMVFS SGTMLS RQL PG YPQ E YQRN 
GGERFG 


6165 


90 


406 


PCQS PGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTI LSNVLKKR 
S C I S RTAPRLLCTLE PG VDTKL KFTLE PS LGQNGFQQW Y DALKA 
VARLSTGI PKEWRRKVWLTLADHYLHS IAIDWDKTMRFTFNERS 
NPDDDSMGIQ I VKDLHRTGCS S YCGQEAEQDRWLKRVLLAYAR 
WNKTVGYCQGFNILAALX LEVMEGNEGDALK I M I YLIDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIMAKLGEQIECCETADEFYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS 
KARDS DEBNDPDDEDAWNAVGCLGP FSGFLAPELQKYQKQI KE 
PNBEQSIjRSNN I ABLS PGAINSCRSE YHAAFNSMMMERMTTDIN 
ALKRQ YS R I K KKQQQQVHQVY I RADKGP VTS I L PSQVNS S P VI N 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA 
EAF PSGCTATAGR EGSS P EGSTRRT I EGQS PE PVFGDADVD VS A 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPEENKAT 
S KAPQGSNS KT P I FS PFPS VKPLRKSATARNLGLYGPTERTPTV 
HFPQMSRSFSKPGGGNSGP * KMVFSSGTMLSRQLPGYPQEYQRN 
GGERFG 


6166 


2 


1206 


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLQEVKRVLYGKELRK 
LDL PR E AFEAAS R ED FE LQG YAFEAAE EQLRRPRI VHVGLVQNR 
IPLPANAPVAEQVSALHRRIKAIVEVAAMCGVNI ICFQEAWTMP 
FAFCTREKLPWTEFAESAEDGPTTRFCQKIiAKNHDMVWSPILE 
RDSEHGDVLWNTAWISNSGAVLGKTRKNHIPRVGDFNESTYYM 
EGNLGHPVFQTQFGRIAVNI CYGRHHPLNWLMYS INGAE 1 1 FNP 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHQDFGYFYGSSYVAAPDSSRTPGLSRSRDGLLVAKLDL 
NLCQQVNDVWNFKMTGRYEMYARELAEAVKSNYSPTIVKE*PAS 
VPALG 


6167 


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEALCACEEYLSN 
LAHMD I DKDLEAPIjYIiTPEGWSLFLQRYYQWHEGAEIjRHLDTQ 

vqrcedilqqlqawpqidmegdrniwivkpgaksrgrgimcmd 
hleemlklvngnpwmkdgkwwqkyierpllifgtkfdlrqwf 
lvtdwnpltvwfyrdsyirfstqpfslknldk*aplyltpegws 
lflqryyqwhegae lrhldtqvqrcedilqqlqawpq idmeg 
drniwivkpgaksrgrgimcmdhleemliclvngnpwmkdgkwv 
vqkyierpllifgtkfdlrqwflvtdwnpltvwfyrdsyirfst 
qpfslknldk 


6168


84 


1392 


vwpvpsvsamppkkqaqaggskkaeqkkkekiiedktfglknkk 
gakqqkfikavthqvkfgqqnprqvaqseaekklkkddkkkelq 
elnelfkpwaaqkiskgadpkswcaffkqgqctkgdkckfsh 
dltlerkcekrsvyidardeelekdtmdnwdekkleewnkkhg 
eaefockpktqivckhfleaiennkygwfwvcpgggdicmyrhal 
ppgfvlkkkkkkkkkedeisl*dlierersalgpnvtkitlesf 
lawkkrkrqekidkleqdmerrkadfkagkalvisgrevfefrp 
elvndddeeaddtrytqgtggde vddsvs vnd idls lyi prdvd 

killil 1 VA^L,ERFSTYTSDKDENKLSEASGGRAENGERSDIiEEDN 

eregtengaidavpvdenlftgedldeleeeuntldlee 


6169


112 


662


AP AAAMAERPE DLNLPNAV I TR 1 1 KEALPDG VN 1 S KE ARSA I S R 
AASVFVLYATSCANNFAMKGKRKTLNASDVLSAMEEMEFQRFVT 
PLKEALEAYRRBQKGKKEASEQKKKDKDKKTDSEEQDKSRDEDN 
uauaai^unjCtCtCt\iriCiLiLiL*vua IvUKJbl VArWlWrljEMRRATCFCE 

AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVF ITGASRGIGKAI ALKAAKDGAN I VI A 
AKTAQPHPKLLGTI YTAAEE I EAVGGKALPCI VDVRDEQQISAA 
VEKAI KKFGG IDI L VNNAS AI S LTNTLDTPTKRLDLMMNVNTRG 
TYLASKACXPYLKKSKVAHIPNISPPLNLNPVWFKQHCGRW-*W 
G * GDGLCL I CFE LNLCMSD V I TI CT 


6171 


362 


941 


HFMQSDVELDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDI FGDS FAAYF PRVLKQVHQALSLSQEAVSVMDSMVRDILD 
RIATEAGHLAHYSKCVTITSRDIRMAVCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHR FSRTHVEAALKMLRREARLRRE YLYRKARE EAQR 
3AQERKERLRRALEENRLIPTELRREALALQGSLEFDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKLVFPGA 
QRMNRGRHEVGALVRACKANGVTDLLWHEHRGTPVGLIVSHLP 
tur * v vruxjiu.Lt'UiAs lNoliiVfvFHljiTHGr SSRIjGKRV 
SDILRYLFPVPKDDSHRVITFANQDDYISFRHHVYKKTDHRNVE 
LTEVGPRFELKLYMIRLGTLEQEATADVKWRWHPYTNTARKRVF 
LSTE * AAPRPLGQLL 


6173


3 


288 


svdhrevqvlsqsmpltphqavlrgerpymcvecgkcfgrsshl 

uynyKin i t»cK.t»x VLoVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
R I HTGERP YVC PLOG KAFNHST VLRSHQRVHTGE KPHRCNECGK 
T FS VKRTL LQHQR I HTGEKP YTCS E CG KA FS DRS VL I QHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHLIQHQiCVHRKL*PTCVLSVGSALAGVPTSFSISVSTLERSP 
MCAVYVGR PSARAQS LVNTGQFTQVRS PMS VMS VEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGLGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADLALAPLGDAQLVLLRPRRLMNANGRSVARAAELFGL 
TAEE V YLVHDELDKP LGR LALKLGGSARGHNG VRS CI S CLNSNA 
KPRLRVGIGRPAHPEAVQAHVLGCFSPAEQELLPLLLDRATDLI 
LDHIRERSQGPSLGP*H*WFSKKA 


6175 


2204 


334 


RYFRADPRSRSGQPRAEGU3AFAEGPLRAMAAPVKGNRKQSTEG

DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEEQEREEEQ 

AFLVS LYKFMKERHTP I ERVPHLGFKQINLWKI Y KAVEKLGAYE 

LVTGRRLWKNVYNELGGSPGSTSGATCTRRHY*RLVLPYVRHLK 

GEDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 

MPGKTKADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 

PEAYKRLLSSFYCKGTHGIMSPLAKKKLLAQVSKVEALQCQEEG 

CRHGAEPQASPAVHLPESPQSPKGLTENSRHRLTPQEGLQAPGG 

SLREEAQAGPCPAAP I FKGCFYTHPTEVLKPVSQHPRDFFSRLK 

DGVLliGPPGKEGLSVKEPQLVWGGDANRPSAFHKGGSRKGILYP 

KPKACWVSPMAKVPAESPTLPPTFPSSPGLGSKRSLEEEGAAHS 

G KRLRAVS P FLKEADAKKCG AKP AGS G LVS CLLG P ALG P VP PEA 

x rvvjr i riunctriuN r lui foir LtlfXaljAAdjy r S PL V I PAFPAHFZ*ATAG 

PS PMAAGLMHF PPTS FDS ALRHRLCPAS SAWHAP PVTTYAAPHF 

FHLNTKL 


6176 


1040 


402 


PLSALRAMAEVHVIGQIIGASGFSESSLFCKWGIHTGAAWKLLS 
GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGLQGWPRLHFQVW 
SQDS FGRCQLAG YG FCH VP S S PGTHQLACPTWR PLGS WREQLAR 
AFVGGGPOLiTjHRnTTYSRAnovDT.irraanr'nruT prrr t t mton 

RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN
CSHCYLDQI KRSDFLGFSGYS PHFVAISTNSEHKMQPSSMQQAL 
PS Q * P Y WTDPR PALV PCCSH R PD VHR <? R pr on t . vkt c r rc no n o 
VCPI 


6178 


1027 


254 


STQRGG I KG VARAAS L VGRRRAGTGMALLiLCL VC LTAALAHG CL 
HCHSNFSKKFSFYRHHVNFKSWWVGDIPVSGALLTDWSDDTMKE 
LHLA I PAKI TREKLDQVATAVYQMMDQL YQG KM Y FPG YFPWELR 
NIFREQVHLIQNAIIESR1DCQHRCGIFQYETISCNNCTDSHVA 
CFGYNCESSAQWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPG 
THRAAPAFLVI,PALRCLEPPHLANLSLEDAA*CLKQH 


6179 


806 


276 


RGETREMAGNLliSGAGRRLWDWVPLACRS FSLG VPRL 1GI RLTL 
PPPKWDRWNEKRAMFGVYDNIGILGNFEKHPKELIRGPIWLRG 
W KGNE LQR C I RKR KM VGSRMFADDLHNLN KR I RY L YKH FNRHG K 
FR*KRKLRTSEKAHLSPWRRETVLFPVRKRI*CIFSVIKWGFFGI 


6180 


156 


1833 


DHH I LKAAS TTHVCARGN I FAI PNTRCLE C * ATAT P S S L.ECQN * 
SHLSLCPLPArrSGLTPNSMIPEKERQNIAERLLRVMCADLGAL 
S WSGKE PLKLAQTLVDSGAR YGAFSVTEII/SNFNTLAIjKHLPR 
MYNQ VKVKVTCALG S NACLG I GVTCHS QS VG PDSCY I LTA YQAE 
GNHIKSYVLGVKGADIRDSGDLVHHWVONVLSEFVMSEIRTVYV

TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 

VIEbLNVCEDLAGSTGLAKETFGSLEETSPPPCWNSVTDSLLLV 

HERYEQICEFYSRAKKMNLIQSIiNKHLLSNLAAILTPVKQAVIE 

iibN£S>gE> 1 LiQIiVruPTYVRLEKLFTAKANDAGTVSKLCHLFIjEAL 

KENFKVHPAHfCVAMILDPQQKLRPVPPYQHEEIIGKVCELINEV 

KESWAEEADFEPAAKKPRSAAVENPAAQEDDRLGKNEVYDYLQE 

PLFQATPDLFQYWSCVTQKHTKLAKLAFWLIiAVPAVGARSGCVN 

M CEQALL I KRRRLLS PEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRpMG'PtAtPAWLQPRYRKNAYLFI
YYLIQFCGHSWIFTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VSLLEIXH I YVGI ESNHLLPRFLQLTER I 1 ILFWITSQEEVQE 
KYWCVLF VFWNLLDMVRYTYSMLS VIGI S YAVLTWLSQTLWM P 
IYPLCVtAEAFAIYQSLPYFESFGTYSTKLPFDLSIYFPYVLKI 
YIjMMLFIGM YFTYS H L YS ERRO I IiG I FP I XKFCKM* S TA FQCDTR 
KDRLW1QCSK*NTGSILVEKFLVF 


6182 


1769 


1224 


AS* IDYQLNTLLKEFQLTEENTKLRYLTdSLIEDMAAAYFPDCI

VRP FG SS VNTFGKLG CDLDM FIiDLDETRNLS AH K I SGN FLME FQ 

VKNVPSERIATQKILSVLGECLDIIFGPGCVGVQKILNARCPLVR 

FSHQASGFQCDLTTNNRIALTSSELLYIYGALDSRVRALVFSVR 

CWARAHSIiTSSIPGAWITNFSliTMMVIFFLQRRSPPILPTLDSL 

KTLADAEDKCVIEGNNCTFVRDLSRIKPSQNTETLEHibKEFFE 

YFGNFAFDKNS INI RQGREQNKPDSS PLY1 QNP FETSLN IS KNV 

SQSQLQKFVDLARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 

APNRKSFTKKKSNKFAIETVKNLLESLKGNRTENFTKTSGKRTI 

STQT 


6183 


1118 


482


HLDRYIKSPGSGSSTPAPPSHHiLYbLHPQSTRTMGCCGCSRGC 
GSGCGGCGS3CGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI*GSGPRPSGFSCLVKAFLM 
VP 


6184 


1 


2191 


IVTVREEDGAPAVAPPGWVSRANKRSGAGPGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWR I ITS 
EliYRSLGDVLRDVDAKALVRSDFLLVYGDVISNINITRALEEHR 
LRRKL* KNVSVMTMI FKESS PS H PTRCHE DNVWAVDS TTNR VL 
nrwiviyvjuKKfAr rijblirU^^^lXjVfcVRYDLLDCHISICSPQVA 
' QLFTDN FD YQTRD DF VRGLL VNEE I LGNQ I HMHVTAKE YGAR VS 
NLHMYSAVCADVI RRWVYPLTPEANFTDSTTQSCTHSRHNI YRG 
PEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 
NWLDQrYLWO^VRVAAGAQIHQSLLCDNAEVKERVTLKPRSVb 

TSQWVGPNITLPEGSVISLHPPDAEEDEDDGEFSDDSGADQEK 

nfO/FfMIfnvMPa RV^aar vrvr uyn t\ r* mvtmc urcrT >%amt t.ir+t -r 
ur. vrvrJAO r W tr/ict v la/iHtjRij i ljWKAAvjnwMEEEEEIjQ^NLWGLKI 

NMEEE SESESEQSMDSEEPDS RGGS PQM DD I KVFQNEVLGTLQR 

GKEENISCDNLVLEINSLKYAYNISLKEVMQVLSHWLEFPLQQ 

MDSPLDSSRYCALLLPLLKAWS PVFRN YIKRAADHLEALAAI ED 

FFLEHEALGISMAKVLMAFYQLEILAEETILSWFSQRDTTDKGQ 

QLRKNQQLQRFIQWLKEAEEESSEDD 


6185


791 


44 


P CTS CVL W ATLHL P ASTRKAP QAECGM ISITEWQKI GVG I TG FG
IFFILFGTLLYFDSVLLAFGNLLFLTGLSLIIGLRKTFWFFFQR 
HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFPSLFKGFFPVAF 
GFLGNVCNI PFLGALFRRLQGTSSMV* KTEMSSLNLDHWLKGAK 
REEWEP PPQS PALTHS PTYPGPPQVQKERNGAEQLTSNPQVDS R 
GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 


6186 


569 


238 


VYGIDSSNTNTHGAEERNRKLKKHWKLCHAQSRLDVNGLALKMA 
KE RKVKN KVKNKADTE EVFNNS PTNQE KMPTS AI LPD FSGS VT S 
NIRNQMETLHSQPHQEENLCFENSFSLINLLPINAVEPTSSQQI 
PNSETSEANKERRKMTSKSSESNIYSPLTSFTTADSELHDII1CD 
LE DCLM VG LHTCG DLA PNTLR I FTSNS E I KG VCS VG CCYHLLS E 
EFENQHKERTQEKWGFPMCHYLKEERWCCGRNARMSACLALERV 
AAGQGLPTESLFYRAVLQDIIKDCYGITKCDRHVGKIYSKCSSF 
LDYVRRS LKKLGLDES KLPE KI IMNYYEKYKPRMNELEAFNMLK 
VVLAPCIETLILLDRLCYLKEQEDIAWSALVKLFDPVKSPRCYA 
VIALKKQQ* FPLKQI IRCISL*DSAGCAEEVSVGDGGPALRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATRPHMEPKASCPA
AAPLMERKFHVLVGVTGSVAAIiKLPLLVSKLLDIPGLEVAVVTT 
E RAKHFYS PQD I P VTL YS DADE WEMWKS RS DPVLH I DLRRWADL 
LLVAPLDANTLGKVAS G I CDNL LTCVMRAWDR3 KPLL FCP AMNT 
AMWEHP I TAQQVDQLKAFGYVE I PCVAKKIiVCGDEGIjGAMAEVG 
TI VDKVKEVLFQHSGFQQS* PGISV?^GVPLYSEWVQAKSVKMDV 

GKIGGYPHLLNGGPALSLPRGQACSRLNWTEGPGLSFFQPGEAA 
A 


6188 


238 


1534 


KGFVNAGPIJ4AEI^VSPQWKAPEMSQICLSCX5HPSA*GPRWASW 
NIGVFICIRCAGIHRNLGVHISRVKSVNLDQWTQEQIQCMQEMG 
NG KANR L YEA YLPETFRR PQI DPAVEGFIRDKYE KKKYMDRSLD 
INAFRXEKDDKWKRGSEPVPEKKLEPWFEKVKMPQKKEDPQLP 
RKS S PKSTAP VMDLLGLDAPVACS IANSKTSNTLEKDLDLLASV 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
QhS KDS I LSLYGSQTPQMPTQAMFMAPAQMAYPTAYPS FPGVTP 
PNSIMGSMMPPPVGMVAQPGASGMVAPMAMPAGYMGGMQASMMG 
VPNGMMTTQQAGYl^GMAAMPQTVYGVQPAQQLQWNLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNBQAANQTLSPQMWK 


6189 


1297 


793 


LGEPLGDLCELIPaDVQQLQMGEVHPGTOAQGSAAQSVAGEVQL 
TQLSHARQRPSCQGSQMALDLQHMDISRQPRWQHVOPVARQVO 
RAQQAQLAE G VAVHLWAGDAWAEVELLQE VGGG KVFAANACDL 
WQDHEG^HAARQATGHALQRVIVQVRRVQPLEAlj*RVPSGLPR 
RVRAFMI LHNQ I TG IGRED FATT Y FLS ELNLS YNR I TS PQVHRD 
AFRKLRI^RSLDLSGNRliHMLPPGLPRJ^VHVLKVKRNEIAALAR 
GALAGMAQLRELYLTSNRLRSRALGPRAWVDLAHLQLLDIAGNQ 
LTE I PEGLPESLEYLYLQNNKISAVPANAFDSTPNLKGI FLRFN 
KLA VG S VVDS AFRRL KHLQ VLDI EGN LEFGDIS KDRGRLGKEKE 
EEEEDEVEEEETR 


6190 


66 


1309 


I LVGNVSFLLSFAEYVCNCS WGSLNVNRCNQTTGQCECH PGYQ
GLHCETCKEGFYLNYTSGLCQPCDCSPHGALSIPCNSSGKCQCK 
VGVI GS I CDRCQDG YYGFS KNGC LP CQCNNR SASCDALTGACLN 
CQENSKGNHCEECKEGFYQSPDATKECLRCPCSAVTSTGSCSIK 
SSELE PECDQCKDG YIGPNCNKCENGYYNFDS I CRKCQCHGHVY 
PVKTPKICKPESGECINCLHNTTGFWCENCL*GYVHDLEGWCIK 

KVILPTPEGST T LV < ?NAQT.TTC\7DTDVTMeTI7T"0 , TWT'T rvr -r n-nt 

TSENSTSALADVS WTQFNI IILTVI 1 1 VWLLMG FVG AVYM YRE 
YQNRKLNAPFWTIELKEDNISFSSYHDS I PNADVSGLLEDDGNE 
VA PNGQLTLTTP I HNYKA 


6191


1212 


1511 


VNLCHGGLLHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS 
MIDWIKKIWYIYTMEYYATIKRNEIMFFAGTWMEMEAIILSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
XGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQBLVAS FSERVRNMS PDE I KI P PEP PG 
RCSNHLQDKIQKLYERKIKEGMDMNYI IQRKKEFRNPSI YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKIEFV1X3TKKGTTTNATSTTTTTASTAVADAQKRKSKW 
D S AI PVTT I AQPT I LTTTATLPAWTVTTS AS G S KTT V I S AVGT 
IVKKAKQ 


6193 


3 


950 


TRGOSNKMAGKKNVLSSIAVYAEDSSPESDGEAGIEAVGSAAEE
KGGLVSDAYGEDD FSRLGGDE DG YE3EEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
R CSNHLQDKI Q KL YER K I K5GMDMN Y I IQRKKE FRNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEAIiAKAQKIEMDKLEK 
AKKERTKIEFVTGTKKGTTTNAT5TTTTTASTAVADA0KRKSKW 
DSAI PVTTI AQPTILTTTATLPAWTVTTSASGS KTTV ISAVGT 
IVKKAKQ 


6194 


3 


950 


TRG CGNKMAG KKNVLS S LA VYAE DS E PESDG EAG I E AVGS AAE E 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PBADDPKDNTEAEKRDPQELVASFSERVRNMS PDEIKI PPEPPG 
RCSNHLQDKI QKLYERKI KEGMDMNY I IQRKKEFRNPS I YEKLI 
QFCAIDEUX5TbTYPKDMFDPHGWSEDSYYEAIiAKAQKIEMDKLEK 
AKKERTK I E FVTGTK KGTTTNATS TTTTTASTAVADAQKRKS KW 

DSAIPVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVISAVGT 
IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDSPSVRKTHCSGRKHKENVKD
YYQKWMEEQAQSLIDKTTAAFQQGKIPPTPFSAPPPAGAMIPPP 
PSLPG P PR PGMM PAPHMGG P PMMPMMGP P PPGMM P VGPAPGMR P 
PMGGHMPMMPGP PMMRP PAR PMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNILDNAEQVISNLEARNL3PRLTPLLQEEDSH
QRLLMGLMVS ELKDHFLRH LQGVE KKKI EQMVLDYI S KLLDLI C 
HI VBTNWRKKNLHS WVLHFNSRGSAAEFAVFHIMTR I LE ATNS L 
FLPLPPGFHTLHTILGVQCLPLHNLLHCIDSGVLLLTETAVIRL 
MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDHPMSSNIISR 
NHVTRLLQNYKKQPRNSMINKSSFSVEFLPLNYFIEILTDIESS 
NQALYPFEGHDNVDAEFVEEAALKHTAMLLGL 


6197 


3 


819 


ADPEGTE2AVMSRYTRPPWTSLFIRNVADATRPEDLRREFGRYG 
PIVDVYIPLDFYTRRPRGFAYVQFEDVRDAEDALYNLNRKWVCG 
RQIEIQFAQGDRKTPGQMKSKERHPCSPSDHRRSRSPSQRRTRS 
RSSSWGRNRRRSDSLKESRHRRFSYSQSKSRSKSLPRRSTSARQ 
SRTPRRN?GSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 
RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALSPSFISPACFLLRKLPALEDGTLPHPD1-LGMNYEGARSE 
RENHAADDS EGGALDMCTSERt.PGLPQP I VMEALDEAEGLQDSQ 
REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFSGYGHIMSQNATNLVSSLLTLLKQLEPTAWLDSGTV
GVPSLLLVFLSGGLVLVTTLVWHLLRTPPEPPTPLPPEDRRQSV 
SRQPSFTYSEWMEEKIEDDFLDLDPVPETPVFDCVMDIKPEADP 
TS LT VKSMGLQERRG S NVSLTLDM CTPG CNEEG FG YLMS PREES 
AREYLLSASRVLQAEELHEKALDPFLLQAEFFEIPMNFVDPKEY 
DI PGLVRKNRYKTI LPNPHSRVCLTS PDPDD PLSS YINANYIRG 
YGGEEKVYIATQGPIVSTVADFWRMVWQEHTPIIVKITNIEEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTBER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCSAGIGRTGCF I ATS I CCQQLRQEGWD I LKTTCQLRQDRGG 
MIQHCEQYQFVHHVMSLYEKQLSHQSPE 


6199


144 


1211 


MARENGESSSS WKKQAEDI KK I FEFKETLGTGA FS EWLAEEKA 
TGKLFAVKCIPKKALKGKESSIENEIAVLRKIKHENIVALEDIY 
BSPNHLYLVMQLVSGGELFDRIVEKGFYTEKDASTLIRQVLDAV 
YYLHRKGIVHRDLKPENLLYYSQDEESKIMISDFGLSKMBGKGD 
VM S TACGTPG YVAPE VLAQKP YS KAVO CWS I G VI AYT LLCG YP P 






WO 01/53312 



PCT/US00/34263 



FyDENDSKLxFEQILKAEYEFDSPYWDDISDSAKDFIRNLMEKOP 
NKRYTCEQAARHPWIAGDTALNKNIHESVSAQIRKNFAKSKWRQ 
AFNATAVVRH^KliHLGSSLDSSN74SVSSSI^LASQ10X»SGTp 
HAL* 


6200 


702 


96 


iiPEVPHSLRPRVKPHLCCAQPAVRVMARbPKIiAVFDLDYTLWPF 
WDTHVDPPFHKSSDGTV1U)RRGQDVRLYPEVPEVLKRLQSLGV 
PGAAAS RTS E I EGANQ LLBLFDL FR YFVH R E I YPGSKI THFERL 
QQKTG I P FSQMI FFDDERRNI VDVSKLGVTCIHIQNGMNLQTLS 
QGLETFAKAQTGPljRSSLEBSPFEA 


6201 


2809 


2383 


GQT PR VR W KMRRS LRAG KRRQTAGRKS KS P P K VP I V I QDDS L PA 
GPPPQIRILKRPTSNGWSSPNSTSRPTLPVKSLAQREAEYAEA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDbTHPVE 
DGIFDSGNFEQFLREKVKVKGKTGNLGNVVHIERFKNKITVVSE 
KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DKU2SESED 


6203


419 


2550 


RCPRPPATAGAAASRPDRSPPSGISGSEAAAGAGAAAPASQHPA

TGTGAVQTEAMKQILGVIDKKLRWLEKKKGKLDDYQERMNKGER 

LNQDQLDAVS KYQEVTNNLEFAKELQRSFMALSQDI QKTI KKTA 

RREQUvJREEAEQKRLKTVLELQYVLDKLGDDEVRTDLKQGLNGV 

PIIiSEEELSLLDEFYKLVPPERDMSLRIiNEQYEHASIHIiWDLLE 

G KE KPVCGTTYKVLKE I VE RV FQSN YFDSTHNHQNGLCE E EE AA 

SAPAVEDQVPEAEPEPAEEYTEQSEVESTEYVNRQFMAETQFTS 

GEKEQVDEWTVETVE WNS LQQQPQAASPS VPE PHSLT P VAQAD 

PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTLDPA1VSAQPM 

NPTQNMDMPQLVCPPVHSESRLAQPNQVPVQPEATQVPLVSSTS 

EGYTASQPLYQPSHATEQRPQKEPIDQIQATISLNTDQTTASSS 

LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFK-MNAPVP 

PVNEPETLKQONOYQASYNQSFSSQPHQVEQTELOQEQIiQTWG 

TYHGS PDQSHQ VTGNHQQ PPQQNTG F PRS NQ P YYNS RG VS RGGS 

RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 

RDYSGYQRDGYQQNFKRGSGQSGPRGAPRGRGGPPRPNRGMPQM 
NTQQVN 


6204 


2933 


787 


CTHNL I SLLGGRALIHFNRFLNLK I QEGEAHNI FCPAYDCFQL V 

PGD1IKSWSKEMDKRYLQFDIKAFVENNPAIKWCPTPGCDRAV 

RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHLFCWBCLGEAHEP 

CDCQTWKNWLOKITEMKPEELVGVSEAYEDAANCLWLLTNSKPC 

ANCKSPIQKNEGCNHMQCAKCKYDFCWICLEEWKKHSFVHWEVI 

YRCTRYEVIQHVEEQSKEMTVEAEKFCHKRFQELDRFMHYYTRFK 

NHEHSYQLEQRLLKTAKEKMEQLSRALKETEGGCPDTTFIEDAV 

HVLLKTRRILKCSYPYGPFLEPKSTKKEIFELMQTDLEMVTEDL 

AQKVNRPYLRTPRHKI IKAACLVQQKRQEFLASVARGVAPADSP 

EAPRRSFAGGTWDWEYLGFASPEEYAEFQYRRRHRORRRGDVHS 

LLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 

SLRDYTPASRSENQDSDQALSSLDEDDPNILLAIQLSLQESGLA 

LDEETRDFLSNEASLGAIGTSLPSRLDSVPRNTDSPRAALSSSE 

LLELGDSLMRLGAENDPFSTDTLSSKPLSEARSDFCPSSSDPDS 

AGQDPN IKDNLLGNIMAWFHDMNPQS IALI PPATTE ISADSQLP 

CI KDGSEGVKDVELVLPEDSMFEDA5 VSEGRGTQI EENPLEBNI 

PGGGKQHPQAW 


6205 


1 


1200 


RAHRG KMALE VGDMEDGQLS DS DS DKTVA PSDR PLQL P KVLGGD
SAMRAFQNTATACAPVSHYRAVESVDSSEESFSDSDDDSCLWKR 
KRQKCFNP P P KP E P FQFGQS SQ KP P VAGGKK I NNI WGAVLQEQN 
QDAVATELGILGMEGTIDRSRQSETYNYLLAKKLRKESQEHTKD 
LDKELDE YMHGG KKMGS KE E ENGOGHLKRKR P VKDRLGNRPEMN 
YKGRYSITAE^SQEKVAI)EIS?RW)EPKKDLIAJ?\A^iiGNKjCA 
1 3 LLME TAE VEQNGGLFI MNGS RRRT PGGVFLNLL KNTPS I S E B 
QI KDI FY I ENOKEYENKKAARKRRTQVLGKKMKQAI KSLNFQED 

DDTSRETFASDTNEJUjASLDESQEGHAEAKLEAEEAIEVDHSHD 
LDIF 


6206 


10 


1442 


iiserrersclhlvcircscdwemgsvlglcsmaswipclcgs 

APCLLCRCCP SGNNS TVTRL I YALFL L VGVC VAC VM L I PG MEEQ 

lnkipgfcenekgwpcnilvgykavyrlcfglamfylllsllm 

IKVKSSSDPRAAVHNGFWFFKFAAAIAIIIGAFFIPEGTFTTVT
FYVGMAGAFCFILIQLVLLIDFAHSWNESWVBKMEEGNSRCWYA 
ALLSATALNYLLSLVAIVLFFVYYTHPASCSENKAFISVNMLLC 
VG AS VMS I L P KIQE SQ PRS GLLQS S V I TVYTMYLTWS AMTNE P E 
TNCNPSLLSIIGYNTTSTVPKEGQSVQWWHAQGIIGLILFLLCV 
FYSSIRTSNNSQVNKLTL.TSDESTLIEDGGARSDGSbEDGDDVH 
RAVDNERDGVTYSYSFFH^FIASLYIMMTLTKWYRYEPSREM 
KS QWTAVWV KISSSWIG I V JLY VWT LVAP LVLTNRDFD 


6207 


2924 


1471 


VMAEAAT P G TT ATT S GAGAAAATAAAAS PT P I P TVT APS LG AG 
GGGGGS DGS GGG WTKQ VTCR Y FMHG VCKEGDNCR YS H DLSDS P Y 
S WCKYFQRG YCI YGDRCRYEHS KPLXQEEATATELTTKSSliAA 
S S S LSS I VGPLVEMNTG EAES RNS NFATVGAGS EDW VNAI EF VP 
GQPYCGRTAPSCTEAPLQGSVTKESSEKEQTAVETKKQLCPYAA 
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKSCIEA 
HEKDMELSFAVQRSKDMVCGICMEWYEKANPSERRFGILSNCN 
HTY CLKC I R KWRS AKQ FES K 1 1 KS C PECRI TSNF VI P S E YWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYXHAYPD 
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE 
WTF3LGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6208 


2924 


1471 


T VMAEAATPGTTATTSGAGAAAATAAAAS PT P I PTVTAPS LGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
S WCKYFQRG YC I YGDRCRYEHS K P LKQ EE ATATELTT KS SLAA 
S SSLSS IVG PLVEMNTGEAESRNSNFATVGAGS EDW VNAIEFVP 
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA 
VGECR YGENCVYLHGDS CDMCGLQ VLHPMDAAQRSQHI KS CI EA 
HEKDMELSFAVQRSKDMVCX3IC>1EVVYEKANPSERRFGILSNCN 
HTYCLKCIRKWRSAKQFES KI I KSCPECRITSNFVI PSEYWVEE 
KEEKQKLIIiKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEB 
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209 


1758 


829 


ER LC F PCMQS KI Y S YMS PN KCSGMR F PLQE ENS VTHH E VKCQGK 
PLAG I YRKREE KRNAGNAVRS AM KS EE QKI KDARKGP LVP FPNQ 
KSBAAEPPKTPPSSCDSTNAAlAKQAIiKKPIKGKQAPRKKAQGK 
TQ QNR KLTDF YP VRRS S R KS KAE LQS E E RXR I DE L I ESG KE EGM 

KIDLIDGKGRGVIATKQFSRGDFWEYHGDLIEITDAKKREALY 
AQDPSTGCYMYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQ 
TKLH D I DG VPHL I LI AS RD I AAG E E LL YDYGDRS KAS I E AH PW L 
KH 


6210 


3761 


387 


I FGMS KLRMVLLEDSGSADFRRHF VNLS PFTI T WLLLSACFVT 
SSLGGTDKELRLVMENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD 
CKHDGWGfQXSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 
SGPIWFDDLICNGNESALWNCKHCGWGKHNCDHAEDAGVICSKG 
ADLSLRLVDGVTBCSGRLEVRPQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGSDLBLRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL 
FJjSS CNGN ETSLWDC KN WQ WGGLTCDH Y K BAK I TCS AHRE PRLV 
GGDIPCSGRVEVKHGDTWGS1CDSDFSLEAASVLCRELQCGTW 
SILGGAHPGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH 
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
j.cutt«vi J uyyuNU\jVAJj5J reGARFXSKGNGQIWRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DSWDLSDAHVVCRQLGCGEAINATGSAHFGEGTX3PIWLDEKKCN 
G KES R I WQCHS HG WGQQNCRH KE DAGV I CS E FMS LRLTS E AS RE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGVVCRQLGCADKGKINP 
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET 
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV 
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VGI LGWLLAI FVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNSCLNADDLDLMNS SGGHSEPH 


6211 



3761 


387 


I FGMS KLRM VLLEDSGS ADFRRHF VNLS PFT 1 T WLLLSACF VT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S V I CNQLG C PTAI KAPG WANS SAG SGR I WMDIIVSCRGNE SALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRIiTRGGNMCSGRIE 
I KFQGRWGTVCDDNFNI DHASVICRQLECGS AVSFSGSSNFGEG 
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG 
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK 
QLGCPTAVTAIGRVNAS KGFGHI WLDSVS CQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADWCRQLGCGSALKTSYQVYSKIQATNTWL 
FLS S CNGNETSLWDCKNWQWGGLTCDHYEEAK I TCS AHRE PRLV 
GGDIPCSGRVEVKHGDTWGSICDSDFSIjEAASVLCRELQCGTVV 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH 
SRDVGWCSRYTEIRIjVNGKTPCEGRVELKTLGAWGSLCNSHWD 

IEDAH VT ifOriT ,\C C*(Z V£ r.CfOrZf^ A o Pr" vr"» Mr»ri t urn 1 1 »jt -rrr i nrvi mn 
iiiwxvti v4jv.^yun.u.iavMijo i ruuHKrLiittaNGOIWRHMFHCTGTEQ 

HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DS WDLSDAHWCRQLGCGEAINATGS AH FGEGTG P I WLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCROLGCADKGKINP 
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET 
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV 
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDX^vnVTDn VATTttD c corie cpta 

VG I LGWLLAI FVALFFLTK KRRQRQRLAVS SRGENLVHQ I Q YR 
EMNS CLNADDLDLMN S SGGH SE PH 


6212 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWS VKLDEH 1 1 PLGSMAI NS I 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS 
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRALYMHFLT 
DTFTAY YRLLITHLGLPQWQYAFTSYGI S PQAKQRVSMYKP I TY 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGP PSS LRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWS VKLDEH I I PLGS MA INS I 
S KLTQLTQSS MYSL PNAPTLADLEDDTH EASDDQ PE KP H FDS RS 



VI FELDS CNG SG KVCLi VY ICS GKPALAEDTE I W FLDRALY WHFLT
DTFTAYYRLLITHIiGLPQWQYAFTSYGISPOAKQRVSMYKPITY 
NTNLLTEETDS FVNKLDPSKVFKS KNKXVI PKKKGPVQPAGGQ K 
GPSGPSGPS7SSTSKSSSGSGNPTRK 


6214 


2 


460 


HBIiAPS AI RRAARLGLGP ARWQSRAAAFVF VRGFRTGWS FVGWV 
VLGTS AKRTRL FFFLS KMAAS SRAQVIAL YRAMLRES KR FSAYN 
YRTYAVRR IRDAFRENKNVKDPVB I QTLVNKAKRDLGVTRRQVH 
IGQLYSTDKLI I ENRDMPRT 


6215 


2 


1849 


FVAGG PRGSGSAAETMPE I RVTPLGAGQDVGRSC I LVS I AGKNV 
MLDCGMHMGFWDDRR F PD FS Y I TQNG R r .TD F LDCVI I S H FHLDH 
CGALPYFSEMVGYDGPIYMTHPTQAICPILLEDYRKIAVDKKGE 
ANFFTSQMI KDCMKKWAVH LHQTVQ VDDELE I KAYYAGHVLGA 
AMFQ I KVGS ES VVYTGDYNMTPDRHLGAAW I D KCR PNLL I TEST 
YATTI RDSKRCRERDFLKKVHETVERGGKVLI PVFALGRAQELC 
ILLETFWBRMNLKVPIYFSTGLTEKANHYYKLFIPWTNQKIRECT 
FVQR NM FEF KH I KA FDRAFADNPG PM WFAT PGM LHAGQS LQ I F 
RKWAGNEKNMVIMPGYCVO^TVGHKILSGQRKLEMEGRQVLEVK 
MQVB YMSFSAHADAKGIMQLVGQAE P ESVLLVHGEAKKME FLKQ 
KIEQELRVNCYMPANGETVTLPTSPSIPVGISLGLLKREMAQGL 
LPEAKKPRLLHGTLIMKDSNFRLVSSEQALKELGLAEHQLRFTC 
RVHLHDTRKBQETALRVYSHLKSVLKDHCVQHLPDGSVTVESVL 
LQAAAPSEDPGTKVLLVSMTYQDEELGSFliTSLLKKGLPQAPS 


6216 


11 


393 


QTTR P E PRNSAIjRQSRS KMA WGVS S VS RLLGRS R PQLGR PMS S 
GAHGEEGSARMWKTLTFFVALPGVAVSMLNVYLKSHHGEHERPE 
FIAYPHLRIRTKPFPWGDGNHTLFHNPHVNPLPTGYEDE 




9 


1178 


TR VG RGE SGJbKME VKPPPGRPQPDSGRR RRRRGE EGHDP KE P EQ" 
LRKLFIGGLSFETTDDSLREHFEKWGTLTDCWMRDPQTKRSRG 
FGFVTYSCVEEVDAAMCARPHKVDGRWEPKRAVSREDSVKPGA 
RTjTVKKIFVGGIKEDTEEYNLRDYFEKYGKIETIEVMEDRQSGK 
KRG PAFVTFDDHDTVDK I WQKYHTI NGHNCEVKKALS KQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGS YGGGDGG YNGFGGDGGN YGGG PGYS S RGG YGGGG PG YG 
NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYNDFGNYSGQQQS 
NYGPMKGGSFGGRS^GSPYGGGYGSGGGSGGYGSRRF 


6218


1305 


906 


S CERRGFIMADDLXRFLYKXL PS VEGLHAI WSDRDG VPVI KVA 
NDNAPEKALRPGFLSTFALATDQGSKLGLSKNKSI ICYYNTYQV 
VQFNRIiPLWSFIASSSANTGLIVSLEKELAPLFEELRQVVEVS 


6219 


* 


890 


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPBADRPHQRPFL 
IGVSGGTASGKSTVCEKIMELLGQNEVEQRQRKWILSQDRFYK 
VLTABQKAKALKGQYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
T YDFVTHS R LP ETTWYPAD WLF EG I LVFYSQE I RDM FHLRLF 
VDTDSDVRLSRRVLRDVRRGRDLEQILTQYTTFVKPAFEEFCLP 
TKKYADVIIPRGVDNMVAINLIVQHIQDILNGDICKWHRGQSNG 
RSYKRTFSEPGDHPGMLTSGKR5HLESSSRPH 


6220 


227 


764 


EQN IS LEWS CT I EKALAO AKAL VER LRDHDD AAESL I EQTTALN 
KRVEAMKQYUEEIQEbNEVARHRPRSTLVMGIQQENRQIRELQQ 
ENKELRTS LEEHQSALELIMS KYREQMFRLLMAS KKDDPGI IMK 
LKEQHS KI DMVHRNKSEGFFLDASRHI LEAPQHGLERRHLEANQ 
NVH 


6221 


98


916 


RWIWDLNPVSDGLBLRPKYNGILHCLTTIMKLDGLRGLYQGVTP 
NIWGAGLSWGLYFVFYNAIKSYKTEGRAERLEATEYLVSAAEAG 
AMTLCI TN PLWVT KTR LM LQYDAWNS PH RQ Y KGM F DTLVK I Y K 
YEGVRGLY KGFVPGLFGTSHGALQFMAYELLKLKYKQHI NRLP E 
AQLSTVEYISVAALSKIFAVAATYPYQWRARLQDQHMFYSGVI 
DVITKTWRKEGVGGFYKGIAPNLIRVTPACCITFWYENVSHFL 
LDLREKRK 


6222 


2 


2116 


rWELRALLLWGRRLRPIaLRAPALAAVPGGKPILCPRHTTAQLG 
PRRN PAW S LQ AG RIj FS TQTAED KEEPLHS I ISSTESVQGSTSKH 
EFQAETKXXiLDI VARS LYSE KE VFI R EL ISNASDALEKLRHKL V 
SDGQALPEMEIHLQTNAEKGTITIQDTGIGMTQEELVSNLGTIA 
RSGSKAFLDALQNQAEASSKIIGQFGVGFYSAFMVADRVEVYSR 
S AAPGS LG YQ WLS DGS G VFE I AEASG VRTGTKI 1 1 H L KSDCKE F 
SSEARVRnvVTKYSNFVSFPLYLNGRRMNTLQAIWMMDPKDVRE 
WQHEE FYR YVAQAHDKPRYTLHYKTDAPLNIRS I FYVPDMKPSM 
FDVSRELGSSVALYSRKVLIQTKATDILPKNLRFIRGWDSEDI 
PLNLSRELLOESALIRKLRDVLQQRIjIKFFIDQSKKDAEKYAKF 
FEDYGLFWR3GIVTATEQEVKEDIAKLLRYESSALPSGQLTSLS 
EYASRMRAGTRNIYYLCAPNRHLAEHSPYYEAMKKKDTEVLFCF 
EQFDELTLLHLREFDKKKLISVETDIWDHYKEEKFEDRSPAAE 
CLSEKETEELMAWMRNVU3SRVTNVKVTLRLDTHPAMVTVLEMG 
AARHFXRMQQLAKTQEERAQLLQPTLEINPRHALIKKLNQLRAS 
E PGLAQLLVDQI YENAM I AAG LVDDPRAMVGR LNELLVKALERH 


6223 
6224


3 
1 


715 
133 


DAWARTMAGMVDFQDEEQVKSFLEKMEVECNYHCYHEKDPDGCY 
RLVDYIiEGI RKNFDEAAKVLKFNCEENQHSDS CY KLGAYYVTGX 
GGLTQDLKAAARC FLMACEKPGKKS I AACHNVGLLARDGQVNED 
GK5PDI/3KARDYYTRACDGGYTSSCFNLSAN1FLQGAPGFPKDMDL 

ACKYSMKACDIXSHIWACANASRMYKXiGDGVDKVEAKAEVLKNRA 
QQVHKEQQ KG VQPLTFG 


6225 


3259 


938 


LRTISSMAWGPLLLTLLAHCTGSWAQSVLTQPPSVSGARIPHEK

LLS CHRLAI CKLPFS VESRKTVMGPQGARRQAFLAFGD VTVDFT 

OKEWRIiLSPAQRALYREVTLENYSHLVSLGILHSKPBLIRRLEQ 

GEV PWGEERRRRPGPCAGI YAEK VLR PKNLGLAHQRQQQLQFSD 

QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 

SQGQRENPTEIDKVLKGIENSRWGAFKCAERGQDFSRKMMVIIH 

KKAHSRQKLFTCRECHQGFRDESALLLHQNTHTGEKSYVCSVCG 

RGFSLKANIjLRHQRTHSGEKPFLCKVCGRGYTSKSYLTVHERTH 

TGE K? YE CQECGRR FN DKSS YNKHLXAHSG E KP F VC KE CGRG YT 

NKSYFWHKRIHSGEKPYRCQECGRGFSNKSHLITHQRTHSGEK 

PFACRQCKQSFSVKGSLLRHQRTHSGEKPFVCKDCERSFSQKST 

LVYHQRTHSGEKPFVCRECGQGFIQKSTLVKHQITHSEEKPFVC 

KDCGRGFIQKSTFTLHQRTHSEEKPYGCRECGRRFRDKSSYNKH 

LRAH LG EKR FFCRDCGRGFTLK PNLT I HQRTHSG E KPFM CKQCE 

KSFSLKANLLRHQWTHSGERPFNCKDCGRGFILKSTLLFHQKTH 

SGB KP F I CSECGQG F I WKSNL VKHQLAHSGKQ P F VCKECGRGFN 

WKGNLLTHQRTHSGEKPFVCNVCGQGFSWKRSLTRHHWR IHSKE 

KPFVCQECKRGYTSKSDLTVHERIHTGERPYECQECGRKFSNKS 

YYSKHLKRHLREKRFCTGS VGEAS S 


6226 


29 


266 


TKV4 ELLGGSQRLFFLPLWRRLCRdGtGPRVS PMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227


2581 


890 


MSAS SLLEQR P KGQGNKVQNGSVHQKDGLNDDDFE P YLS PQARP 
NNAYTAMSDSYLPSYYSPSIGFSYSLGEAAWSTGGDTAMPYLTS 
YGQLSNGEPKFLPDAMFGQPGALGSTPFLGQHGFN?KPSGIDFS 
AWGNNSSQGQSTQSSGYSSNYAYAPSSLGGAMIDGQSAFANETL 
NKAPGMNTIDOGMAALKLGSTEVASNVPKWGSAVGSGSITSNI 
VASNSLPPATIAPPKPASWADIASKPAKQQPKLKTKNGIAGSSL 
PPPPIKHNMDIGTWDNKGPVAKAPSQALVQNIGQPTQGSPQPVG 
QQAKNS P PVAQAS VGQQTQPLPPPP PQ PAQLS VQQQAAQPTRWV 
APRNRGSGFGHNGVDGKGVGQSQAGSGSTPSEPHPVLEKLRSIN 
NYNPKDFDWNLKHGRVFI I KS YSEDD I HRS I KYNI WCSTEHGNK 
RLDAAYRSMNGKGPVYLLFSVNGSGHFCGVAEMKSAVDYNTCAG 
VWSQDKWKGRFDVRW I FVKD VPNSQLRH I RLENNENKPVTNSRD 
TQEVPLEKAKQVLKI I AS YKHTTS I FDDFSHYE KRQ 

6228 






47 


1978


GRRCRRRGAVMELAQSARHLGCWAVEEMGVPVAARAPESTLRRI, 
CLGQGADIWAYILQHVHSQRTVKKIRGNLLWYGHQDSPQVRRKL 
E LEAA VTRL RA E I Q ELDQS L E LME RDTEAQDT AM EQ ARQH TQ DT 

ORRALLLRAOACAMRRnnUTT DDDMnOT r\\ir\r nnt Anunn»^ „, , 

^.mnuuunjiyMy^i'uuiyyu 1 U r M(J Kl^ LlKR JjQDMER KAKV 

DVTFGSLTSAALGLEPWLRDVRTACTLRAQPLQNLLLPQAKRG 
S LPTPHDDHFGTS YQQWLS S VETLLTNH P PGHVLAALEHLAAER 
EAEIRSLCSGDGLGDTBISRPQAPDQSDSSQTLPSMVHLIQEGW 
RTVGVLVSQRSTIiIilCERQVLTORLQGLVEEVERRVLGSSERQVL 
IIiGLRRCCLWTELKALHDOSQELQDAAGHRQLLLRELQAKQQRI 
LHWRQLVEETQEQVRLLIKGNSASKTRLCRSPGEVLALVQRKW 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
1 v ^^ &an vij«^Ai^K^bbFIAI*SHKLGLPPGKASELLLPAAASL 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 
QKENLGQALKRLEKLLKQALERIPELQGIVGDWWEQPGQAALSE 
ELCQGLSLPQWRLRWVQAQGALQKLCS 


6229 


1571 




GPSIiLGTRGTPNPARTLQIFFLUGRRLTGRMAAVDDLQFEEFG 
NAATSLTANPDATTVNIEDPGETPKHQPGSPRGSGREEDDELLG 
NDDSDKTELLAGQKKSSPFWTPEYYQTFFDVDTYQVFDRIKGSL 
LPIPGKNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLSNFLI 
H LGEKTYH YVPE FRKVS I AAT 1 1 YAYAWLV P LALWG FLMWRNS K 
vn« x vto xt> jr IjEI VCVYGYSLFIYI PTAILWI I PHKAVRWIIjVMI 
ALG I SGSLLAMTFWPAVREDNRRVAIiATI VTI VLLHMLLS VGCL 
AYF FDAP EMDHLPTTTATPNQTVAAAK S S 


6230 


1723 


600 


skwsgrsgkkkmsklsrsaragvippvgrlmryLkkgtfkyris 
vgap vymaavi eylaae i lelagnaardnkkar iaprhi llava 
ndeelnqllkgvtiasggvlprihpellakkrgtkgksetilsp 

sedgpgdgftilsskslvlgqklsltqsdishigsmrvegivhp 
ttaeidlkedigkalekaggkefletvkelrksqgplevaeaav 
sqssglaakfvihchipqwgsdkceeqleetiknclsaaedkkl 
ksvafppfpsgrncfpkqtaaqvtlkaisahfddssasslfcnvy 
fllfdsesigiyvoemakldak 


6231 


149 


870 


Lilh'SSS TMDRS LRNV LWS FG FLLLFTAYGGLQS LQSSL YS E EG 

lgvtalstlyggmllssmflppllierlgckgtiilsmcgyvaf 
svgnffaswytliptsillglgaaplwsaqctyltitgnthaek 
agkrgkdmvnqyfgi ffli fqssgvwgnlisslvfgqtpsqetl 
pebqltscgasdclmattttnstqr psqglvytllgi ytgsgvl 
avlmiaaflqpirdvqrese 


6232 
6233


3679 
1 1 


1476 

2651 


tVAGTTMAGFWVGTAPLVAAGRRGRWPPQQtMLSAALRTi,KHVL 

yysrqclmvsrnlgsvgydpnektfdkilvanrgeiacrvirtc 

KKMG I KTVAIH S D VDAS S VHVKMADE AVCVG PAPTS KS YLNMDA 
IMEAIKKTRAQAVHPGYGFLSENKEFARCIiAAEDWFIGPDTHA 
IQAMGDKIESKIiLAKKAEVNTIPGFDGWKDAEEAVRIAREIGY 
PVMIKASAGGGGKGMRIAWDDEETRDGFRLSSQEAASSFGDDRL 
LIEKFrDNpRHIEIOVLGDKHGNALWLNERECSIQRRNQKVVEE 
APSI FLDAETRRAMG EQAVALARAVKYSSAGTVE FLVDS KKNF Y 
FLEMNTRLQVEHPVTECITGLDLVQEMIRVAKGYPLRHKQADIR 

ingwavecrvyaedpyksfglpsigrlsqyqeplhlpgvrvdsg 

1QPGSDISIYYDPMISKXITYGSDRTEALKRMADALDNYVIRGV 

thniallreviinsrfvkgdistkflsdvypdgfkghmltksek 
nqllaiassiifvafqlraqhfqensrmpvikpdianwelsvklh 
dkvhtwasiwgsvfsvevdgsklkvtstwnlaspllsvsvdgt 
qrtvqclsreaggnmsiqflgxvykvniltrlaael>tkfmlekv 

TEDTS S VLRS PM PG WVAVS VKPGDAVAEGQE I C VI EAMKMQNS 

MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 

HSTRENLNAGNFNFPSEGHLVRSTGPGGSFAKHMVAQCVSPKGP 








LACSRTYFFGATHVPYLGGDSKLPKKTEQIRLIiSQiyAAVIEAV 
LAGIACYAKTSSLTKAKEVAEOTLGSGLDS FELI PFKAALRSKM 
TFH IHAVNNQGRI VPLDSEDSbS FVKTACMAVYD I PDLLGGNGC 
liGSWFSESFLTSQILVKEKDGTVTTETSSWLTAAVPRFCSVJL 
VEDNEVfCLSEKTHQAVRGDES FLGTYLTGGEGAYLYS SNLQS VJ P 
E EGNVH F FS SGLLFS HCRHG S III SKDHMNS I S FYDGDSTS TVA 
ALLIDFKSSLLPHLPVHFHGSSNFLMIALFPKSKIYQAFYSEVF 
S LWKQQDNSGI S LKVIQEEK3LSVEQKRLHSS AQKLFS ALSQPAG 
EKRSSIjKLIjSAKLPELDWFIiQHFAISSISQEPVMRTHLPVLLQQ 
AEINTTHRIESDKVIISIVTGLPGCHASELCAFLVTLHKECGRW 
MVYRQIMDSSECFHAAHFQRYLSSALEAQQNRSARQSAYIRKKT 
RLLWLQGYTDVIDWQALQTHPDSNVKASFTIGAITACVEPMS 
CYMEHRFLFPKCLDQCSQGLVSNWFTSHTTEQRKPLLVQLQSL 
I RAAN PAAAF 1 LAENG I VTRNED I EL I LS ENSFSS PE MLRS R YL 
MYPGWYEGKLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKAIQS 
S I KPSPFSGNI YHILGKVKFSDS ERTMEVCYNTIiANS LS IMPVL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDSIKDWLRQSA 
KQKPQRKALKTRGMLTQQEI RS r HVKRHJjEPLPAGYFYNGTQFV 
NF FGDKTDFHP LMDQFMND YVEE ANRE I E KYNQELEQQE YHDLF 
ELKP 


6234 


1731 


404 


PRVREDMDHKSPGNKGSLVYAGI KS IVKSSLGMVESSRHNWSGL 
DKQS D I QNLNE ERI LALQLCG W I KKGTDVDVGP FLNS LVQEGE W 
ERAAAVALFNLDIRRAIQILNEGASSEKGDLNLNVVAMALSGYT 
DE KNSLWREMCST LRLQIiNN P YL CVMFAFLTS ETGS YDG VLYEN 
KVAVRDRVAFACKFLSDTQLNRY IEKLTNEMKEAGNLEG I LLTG 
LTKDGVDLMESYVDRTGDVQTASYCMI*QGSPLDVLKDERVQYWI 
ENYRNLLDAWRFWHKRAEFDIHRSKLDPSSKPLAQVFVSCNFCG 
KS I SYSCSAVPHQGRGFSQYG VSGS PTKS KVTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKLAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACTCKCMQLDTTGNLVPAETV 
OP 


6235 


1 


571 


E KR DHRLPS WPRAALKVPGRGGR VGTTPE LAAGG I MATRN P PPQ 
D YE SDDDS YE VLDLTE Y ARRHQW WN R V FGHS SG PMVE KYS VATQ 
I VMGGVTGWCAGFLFQKVGKLAATAVGGGFLLLQI AS HSG YVQ I 
DWKRVEKDVNKAKRQ I KKRANKAAPE INNLI EEATEFI KQN1 VI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKELSRSAKKCDKEEKAEKAKIKKAIQKGNMEVARfHAE 
NAI RQKMQAVNFLRMS ARVDAVAARVQTAVTMGKVTKSP1AGVVK 
SMDATLKTMNLEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEGIAAGGVMDVNTALQEVLKTALIHDGLARGIREAAKA 
LDKRQAHIX^TLASNCDEPMWKLV^AIKMHQINLIKVBDNKKL 
GEVTVGLCKIDREGKPRKWGCSCVVVKDYGKESQAKDVIEEYFK 
CKK 


6238 


2 


4666 


EBVPTQESVKWEINVIIKNPEIVFVADMTKNDAPALVITTQCEi 
CYKGNLENSTMTAAI KDLQVRACP FLPVKRKG KI TTVLQPCDLF 
YQTTQKGTD PQ V I DMS VKS LTLKVS ? VI I NTM I T I TS ALYTTKE 
TIPEETASSTAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEMIKMNIDSIFIVLEAGIGHRTVPMLLAKSRFSGEGKmfSSL 
INLHCQLELEVHYYNEMFGVWEPLLEPLEIDQTEDFRPWNLGIK 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCG 
LVT4LNNLVKAFT2AATGSSADFVKDLAPFMILNSLGLTISVSPS 
DSFSVLNIPMAJCSYVLKNGESLSMDYIRTKDNDHFNAMTSLSSK 
LFFILLTPVKHSTADKIPLTKVGRRIiYTVRHRESGVERSIVCQI 








DT VEGS K KVT I RS P VQ I RNH FS VPLS VYEGDTLLGTAS PE NE FN 
I PLGS YRS FI FLKPEDENYQMCEG I DFSE 1 1 KNDGALLKKKCRS 
KNP S KES FLIN I V PE KDNLTS I»S VYS EDGWDLP YIMHLWP PILL 
RNIjLPYKIAYYIEGIENSVFTLSEGHSAQICTAQLGKARLHLKL 
LDYLNHDWKSEYH I KPNQQDI S F VSFTCVTEMEKTDLD I AVHMT 
YNTGQTWAFHS P Y WMVNKTGRMLQ YKADGI HRKHP PN YKKPVL 
FSFQPNHFFNNNKVQLMVTDSBLSNQFS I DTVGSHGAVKCKGLK 
MDYQVGVTIDLSSFN1TRIVTFTPFYMIKNKSKYHISVAEEGND 
KWLSLDLEQCIPFWPEYASSKLLIQVERSEDPPKRIYKNKQENC 
ILLRLDNELGGIIAEVNLAEHSTVITFLDYHDGAATFLLINHTK 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYES E KAE LAEQE I AVALQDVG I S LVNN YTKQE VAY J G I TS SD 
VVWETKPKKKARWKPMSVKHTEKLEREFKBYTESSPSEDKVIQL 
DTNVP VR LT PTGHNMK I LQ PHV I ALRRNY L PALKVE YNTSAHQS 
SFRIQIYRlQIQNQIHGAVFPFVFYPVKPPKSVrMDSAPKPFTD 
VS I VMRSAGHSQISR I KYFKVL 1 QEMDLRLDLGFI YALT13LMTE 
AEVTENTEVELFHKDIEAFKEEYKTASLVDQSQVSLYEYFHISP 
I KLHLS VSLSSGREEAKDS KQNGGLI PVHS LNLLLKS IGATLTD 
VQDWFKLAFFELNYQFHTTSDLQS EVIRKYS KQAI KQM YVLIL 
GLDVLGNPFGL I REFS EG VE AF F YE P YQG A I QG PEEFVEGMAIiG 
LKALVGGAVGC LACAAS K I TGAMAKGVAAMTMDED YQQKRREAM 
NKQPAG FREGI TRGG KGLVSG FVS G I TG I VTKP I KGAQKGGAAG 
F F KGVGKGL VGAVAR P TGG 1 1 DMAS STFQG I KRATETSEVES LR 
P P RF FNEDG VI R P YRLRDGTGNQ MLQKIQ FYRE W I MTHSS S S DD 

VVDVL)\JUL>li,£} UUjiti 


6239 


2108 


634 


K PG MAG KG S SGRR PL-LLGLL VAVATVH LV I C P YTKVEES FN LQA j 
TKDLLYHWQDLEQYDHLEFPGVVPRTFLGPWIAVFSSPAVYVL j 
SLLEMSKFYSQLIVRGVLGLGVIFGLWTLQKEVRRHFGAMVATM : 
FCWVTAMQFHLMFYCTRTLPNVLALPVVLIJUiAAWLRHEWARFI 
WLSAFAIIVFRV3LCLFLGLLLLLALGNRKVSVVRALRHAVPAG 
ILCLGLTVAVDS YFWRQLTWPEGKVLWYNTVLNKS SNKGTS PLL 
P7YFYSALPRGLGCSLLFIPLGLVDRRTHAPTVLALGFKALYSLL 

GHLWNAAYSATALYVSHFNYPGGVAMQRLHQLVPPQTDVLLHI 
DVAAAQTGVSR FLQVNSAWRYDKREDVQPGTGMLAYTH I LMEAA 
PGLLALYRDTHRVLASVVGTTGVSLNLTQLPPFNVHLQTKLVLL 
ERLPRPS 


6240 


2202 


1176 


HERGDSLKEPTSIAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSS LKSAQGTG F3LGQLQS IRSEGTTS TS Y KS LANQ 
TRNGS L S YDS LLTPS DS P D FES VOAGP E PDP PLG YTS P FLSAR L> 
AQQRE AE RHPRL V PTG P THR2PS P VR YDNLSRH I VAS LQE REKL 
LRQSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LGKTPLGRPAVPRFGKPDGLRGRGVGSPEPGPTAPYLGRSMSYS 
SQKAQPGVSBTEEVALQPLLTPKDEVQLKTTYSKSNGQPKSLG3 
ASPGPGQPPLSS PT RGGVKKVSG VGGTT YE I S V 


6241 


3 


1341 


RNAEEKXRLSLQREKI IARVS I DNRTRALVQALRRTTDP KLC I T 
RVEELTFHLLEFPEGKGVAVKERIIPYLLRLRQIKDETLQAAVR 
EILALIGYVDPVKGRGIRILSIDGGGTRGWALQTLRKLVELTQ 
KPVHQLFDYICGVSTGAILAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVI VGTVKMS WSHAF YDSQTWENI LKDRWGSALMIETARN P 
TCPKVAAVST I VNRGITPKAFVFRNYGHFPG INSHYLGGCQYKM 
WQAIRASSAAPGYFAEYALGNDLHQDGGLLLNNPSAZiAMHECKC 
L WPDVP LE CIVS LGTGRY E S DVRNTVTY TS LKTKLSNVI NS ATD 
TEE VHI MLDG LLPPDTYFRFNP VMCEN I PLDE S RN E KLDQLQLE 
GLKYIERNEQKMKKVAKILSQEKTTLQKINDWIKLKTDMYEGLP 
PFSKL 


6242 


198 


1310 


QHFLPGAETWSPGAAVCTARHFPGRSLAAFPRPAAPRRAVEMGE 
SSEDIDQMFSTLLGEMDLLTQSLGVDTLPPPDPNPPRAEFNYSV 
G FKDbNESLNALEDQDLDALMADLVADI SEAEQRTIQAQKESLQ 
NQHHSASLQASIFSGAASU3YGTNVAATGISQYEDDLPPPPADP 
VLDLPLPPPPPEPLSQEEEEAQAKADKIKLALEKLKEAKVKKLV 
VKVHMOTDNSTKSLMVDERQLARDVLDNLFEKTHCDCNVDWCLYB 
IYPELQIERFFEDHENWEVLSDWTRDTENKILFLEKEEKYAVF 
KNPQNFYLDNRGKKESKETNEKMNAXNKESLLEVRLILQSGRKE 
KDVCS I FKS FAS ENNGKI 


6243 


1509 


614


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TSRASSRRLACGPQTRAGAETRSTAMIRAKSAARDTRRATCRSA 
AGTPS PTTMTCLTDVPTGCAAVE PTARLPAAAWASTITTGCCPA 
MGQAGAGPAGRKGSEAGGGPGRAHHAHPSPLPREPRVRTG? PAH 
SPTPGSIDPSPELSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPIJC 
GPPILAPILSLTPILSRWSCYFPRSRIAQGWHLS 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSLWSWVNQPSELSK 
FTNPLFEANNLVIWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNI IEYNKELQAKVNILRRQLAELETEDGMQESP 


6245 


81 


1148 


LSLRNAKYSFPQELISLFSMTDLNDNICKRYIKMITNIVILSLI 
ICISLAFWIISMTASTYYGNLRPISPWRWLFSVWPVLIVSNGL 
KKKSLDHSGALGGLWGFI LTIANFS FFTSLLMFFLSSSKLTKW 
KGEVKKRLDSEYKEGGQRNWVQVFCNGAVPTELALLYK1ENGPG * 
EIPVDFSKQYSASWMCLSliLAALACSAGDTWASEVGPVLSKSSP 
RL I TTWE KV P VGTNGG VT WGLVS S LLGGTFVG I A Y FLTQL I FV 
NDLDISAPQWPI IAFGGLAGLLGS I VDSYLGATMQYTGLDESTG 

MWNS PTNKARHI AGKP ILDNNAVNLFS S VLIALLLPTAAWGFW 
PRG 


6246 


1177 


359 


SliWPWILMDDSLMQISLQLLCVYTANFPNGCSSLCWSSCGQHPV" 
QATHRG A VSNS LM LC I L KLASQM PLENTT VQQMVFMLLS NbALS 
HDCKGVIQKSNFLQNFLSLALPKGGNKHLSNLTILWLKLLLNIS 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFHNVCFS 
PANKP KI LANE KV ITVLAACLESENQNAQRIG AAALWAL I YNYQ 
KAKTALKS PS VKR RVDEA YS LAKKTFPNSEANPLNA Y YL KCLEN 
LVQLLNSS 


6247 


3 


1678 


NSRWGPWTEP5AGSLRPMARKQNRNSKELGLVPLTDDTSHAGP 
PGPGRALLECDHLRSGVPGGRRRKDWS CSLLVAS LAGAFGS S FL 
YGYNLSWNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 
S I FAIGGLVGTLI VK^JIGKVLGRKHTLllANNGFAISAALL^4ACS 
LQAGAF EMLI VG R FI MG I DGGVALS VLPMYLSE I SP KE I RGSLG 
QVTAI FX CIGVFTGQLLGLPELLGKBST WPYLFGVI WPA WQL 
LSLPFLPDSPRYLLLEKHNFJVRAVKAFQTFLGKAHVSQEVBEVL 
AESRVQRS IRLVS VL3LLRAP YVRWQWTVI VTMACYQLCGLNA 
I WFYTNS I FGKAG I P PAKI P YVTLSTGG I ETLAAVFSGLVI EHL 
GRRPLblGGPGLMGLFFGTLTITLTLQDHAPWVPYLSlVGILAI 
IAS FCSGPGGI P FI LTGEFFQQS QR PAA F 1 1 AGTVNWLSN FAVG 
LL FP P I QKS LDT YCFL VFAT I CI TG A I YL YFVLP ETKNRTYAEI 
SQAFSKRNKAYPPEEKIDSAVTDGKINGRP 


6248 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLV1LSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKKMNIjEGS I QDLFELFS SNENQPLTTKVCWP 
SQPWBLVLMKVLGACKLLLRLLDCCCKTFLLl'VKKI^LQEFII 
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQS PRAS EETLLG I SKKAKQMKINVQNNVDLGQP VKNKRVF 








KEESSBFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK

VIGTPHAXSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI 

FLGNKIiLKSNRLKHIiEAQGTSLPKKLECIKTSICNHLLRGSGIK 

TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS 

ATDTSAKWRLSHCTVHRTDIjYPNSKQLLNSGVSMPVlQTKEKMI 

H ENLRG IHE NETDS WTVMQ I NKNSTSGT I KETDDI DDI FALMGV 


6249 


56 


1773 


VP PPRMMAAVPPGLE PWNRVRI PKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLVILSLKSOTLDAETDVTiCAVI»YQfJHNPMr"DiJVDUT 
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP 
SQ P WELVLMKVl^ACKLLLRLLDCCCKTFLLTVKHLGLQEFI I 
LNLVMVGLVSRLWVLYKGVLKRLILLYBPLFGLLQEVARIQPMP 
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQSPRASEBTLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEEIOMAVVWCRSKKLKAQAI 
F LGNKLLKSNRLKHLEAQGTS L P KJCLEC I KTS I CNHLLRG SG I K 
TSKHHLRORRSONKFLRRORKPnP n^QTT T DPTrvMrcnmrnt/*? 

ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI 
H ENLRG I HENETDS WTVMQI NKNS TSGTI KETDDI DD I FALMGV 


6250 


232 


1306 


LAALHIHALPFRKDLEKYKDLDEDELljQNLSETELiKQLETVIaDD 
LDP ENALL PAG FRQKNQTS KS TTG P FDREHLLS YLE KEALEH KD 
REDYVPYTGEKKGKIFIPKQKPVQTFTEEKVSLDPELEEALTSA 
SDTELCDLAAI LGMHNLITNTKFCNI MGSSNGVDQEHFSNWKG 
EKI LP VFDE P PN PTNV K E T ,K R T Y RMTHiU X .w\nar JJM T itm t n t n 
TLK DFAKALETNTHVKC FS LAATR S NDPVATAFAEMLKVN KTLK 
S LNVESNF I TG VG I LAL I DALRDNETLAE LK I DNQRQQLG TAV E 
LEMAKMLEENTN I LKFG YQFTQQG P RTRAANAI TKNNDL VR KRR 
VEGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLLARGPGVRAAPPRDPRPSHPE 
PRGCG AAPGRTLH FTAAVPAGHNKWS KVRH I KG P KD VER SR I FS 
KLCLNIRLAVKEGGPNP EHNSNLAN I LEVCRS KHMPKST IETAL 
KMEKSKDTYLLYEGRGPGGSSLLIEALSNSSHKCQADIRHILNK 
NGGV>IAVGARHSFDKKGVIVVEVEDREKKAVNLERALEMAIEAG 
AEDVKETEDEE ERNVFK F I CDASS LHQ VR KKLDS LG LC S VS CAL 
E FI PNS KVQLAE PDLEQAAHLI QALS NHEDV I HVYDNI B 


6252 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
BTVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHSKAKTRSRLEVAEAEEEETS IKAARSELIiLAEE PG FLEGE 
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG 
RHLAFGGRRGH VAALDWVTKKLMC E I NVMEAVRD I RFLH S EALL 
AVAQNR WLM I YDNQG I ELHC I RRCDRVTRLEFLP FH FLLATAS E 
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYKAVIHLGHSNG 
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KIFDLRGT YQPLS TRTLPHGAGHLAFSQRGLLVAGMGDWN I WA 
GOGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGAR PS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGAS PGPPRNKKNRELRPQRPKNAYILKKSR I SKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHSKAKTRSRLEVAEABEEETS I KAARS ELLLAE E PGFLEG E 
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG 
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL 








AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFIjPFHPLLATASE i 
TG FLTYLDVS VG K I VAALN ARAGRLDVMSQN PYNA V I H LGHS NG 
TV^LWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KI FDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA 
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 

ormvrv>ftUDr«f lAjJbriSJNy iKoKMjKOEWEVKAT tTjEKVPAELIC 1 

LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 

SSTASLVKRKRKVMDEEHRDKVRQSLCXJQKHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALG RRGGS Q E LSAAACGCFALR LRAPGS GR PALA PG AAA FAGL 
GGAPRFPPRGSAAGRTMLLKEYRICMPLTVDEYKIGQLYMISKH 
SHEQSDRGEGVEWQNEPFEDPHIIGNGQFTEKRVYLNSKLPSWA 
RAWPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIHIETKYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK 
SBKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV 
HXWRDI 1»L IGHRQA FAWVDE W Y DMTMDDVRE YE KNMHEQTN I K 


6255 


1 


1444 


PTRPQQELLVSLATVI FVASQKALS VESKAVI KQQLESVSNGWT | 
VYRIARQASRMGNHDMAKELYQSLLTQVASKHFYFWLNSLKEFS 
HAEQCLTGLQEENYSSALSCI7VESLKFYHKGIASLTAASTPLNP 
hS FQCE FVKL R I DLLQAFSQL 1 CTCNS LKTS PP PAIATT I AMTL 
GNDLQRCGRI SNQMKQS MEEFRSLASRYGDLYQAS FDADS ATLR 
NVELQQQSCLLI SHAI E ALI LDPESAS FQEYGSTGTAHADS E YE 
RRMMSVYNHVLEEVESLNGKYTPVSYMHTACLCNAIIALLKVPL 
SFQRYFFQKLQSTSIKLALSPSPRWPAEPIAVQNNQQLALKVEG 
WQHGSKPGLFRKIQS VCLNVSSTLQSKSGQDYKI PIDNMTNEM 
EQR VE PHNDY FS TQ FLLN FAI LGTHN I TVESS VKDANG I VW KTG 
PRTTIFVKSLEDPYSQOIRLQQQQAQQPLQOQQQRNAYTRF 


6256 


1 


1542 


CRGAGAE PAAN P RS P R S L VPS LES TS TS V P PAPGTMATDS W ALA 
VDEQEAAAESLSNLHLKEEKIKPDTNGAWKTNANAEKTDEEEK 
EDRAAQSLLNKLI RSNL VDNTNQVEVLQRDPNS PLYS VKSFEEL 
RLKPQLLQdVYAMGFNRPSKIQENALPLMLAEPPQNLIAQSQSG 
lurv.AAiU' VJjAMbSQVEPANKYPgCLCLSPTYEIiALQTGKVIEQM 
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKLK 
FIDPKKIKVFVLDEADVMIATQGHQDQSIRIQRMLPRNCQMLLF 
SATFEDSVWKFAQKWPDPNVIKLKREEETLDTIKQYYVLCSSR 

DEKFOALCNLYfiA TTTBnRMT BTTITP VT& OUTT t\ 7\ »t c? tri?/-iTi*-»i t-tl 1 

LLSGEMMVEQRAAVIERFREGKBKVLVTTNVCARGIDVEQVSVV 
INFDLPVDKDGNPDNETYLHRIGRTGRFGKRGLAVNMVDSKHSM 
NILNRIQEHFNKKIERLDTDDLDEIEKIAN | 


6257 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE 
NNIVKEELALLDGSNWFKLLGPVLVKQELGEARATVGKRLDYI 
TAEI KR YESQLRDLERQS EQQRETLAQLQQE FQRAQAAKAGAPG 
KA 


6258 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE

NNIVKE ELALLDGS NWFKLLGPVLVKQELGEARATVGKRLDY I 

TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 
KA


6259 


2 


1540 


ILEKGFPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNNKTV I 
S VENGDRGS KTFNLGTDP VSLRNYPYK I CDS CEMNLKNI SGL 1 1 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPSFGQSFEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNECGRTF I ESLKLNISQRPHLEMEP YGCS I CGKS FCMNLRFGH 
QRALTKDNPYEYNEYGEIFCDNSAFIIIIQGAYTRKILREYKVSD 
KTWEKSALLKHQIVHMGGKSYDYNENGSNFSKKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ | 








KPHLTNHQRTHTGEKPYECKQCGKTFCVKSNLTEHQRTHTGEKP
YE CNAOG KS FCHRS ALTVHQRTHTGEKP FI CNECGKS FC VKS NL 
IVHQRTHTGEKPYKCNECGKTFCEKSAIjTKHQRTHTGEKPYECN 
ACGKTFSQR3 VLTKHQR IHTRVKALSTS 


6260 


2081 


1436 


GTGPEIHACAHASARAPGSRAMALREIjKVCLLGDTGVGKSSIVW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPMYYRGSAAAI I VYDI TKEETFSTLKNWVKELRQHG PPN I 
WAI AGNKCDLIDVREVMERDAKDYADS IHAI FVETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1168 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR ' 

SPACEPCRPDFAPRPALLLRSGPRSAPAVTGKPALKGQPGPWPG 

MAE VS I DQS KLPGVKE VCR DFAVLEDHTLAHSLQEQE I EHHLAS 

NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDLEQQDCEIA 

QEIQEKLAIEAERRRIQEKKDEDIARLLQEKELQEEKKRKKHFP 

EFPATRAYADSYYYEDGGMKPRVMKBAVSTPSRMAHRDQEWYDA 

BIARKLQEEELLATQVDMRAAQVAQDEEIARLLMAEEKKAYKKA 

KERE KS SLDKR KQDPE WKP KTAKAANS KS KBSDE PHH S KNERPA 

RPPPPIMTbGEDADYTHFTNQQSSTRHFSKSESSHKGFHYKH 


6262 


2 


1759 


PECHSQGLCSVHRPGKVPQARMSGLVLGQRDEPAGHRLSQEEIL 
GSTRLVSQGLEALRSEHQAVLQSIiSQTIECljQQGGHEEGLVHEK 
ARQLRRSMEN I ELGLSEAQVMLALASHLSTVES E KQKLRAQVRR 
LCQENQWLRDELAGTQQRLQRS EQAVAQLEEEKKHLE FLGQLRQ 
YDEDGHTSBEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQQGG Y E I PAR LRTIjHNLVI QYAAQGRYEVAVP LCKQALE DL ER 
TSGRGHPDVATMLNILALVYRDQNKYKEAAHLIiNDAIjSIRESTL 
G PDH P AVAATLNNLAVL YGKRGKYKEAEPLCQRAIiE I REKVliGT 
NH PDVAKQLNNLALLCONQGKYEAVERY YQRALAI YEGQLG PDN 
PNVARTKNNLASCYLKQGKYAEAETLYKE I LTRAHVQEFGS VDD 
DHKPIWMHAEEREEMSKSRHHEGGTPYAEYGGWYKACKVSSPTV 
NTTLRNLGAL YRRQGKLEAAETLEE CALRSRBQGTDP I S Q TKVA 
ELLGESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSIiGKIRDVLRR 


6263 


1 


2408 


RELDSIiADLPERIKPPYANGLSTSHLRSSSVEDVKlrllSEGRPT 
IEVRRCSMPSVICEHTKQFQTISBESNQGSLLTVPGDTSPSPKP 
EWSNVPERDLSNVSNIHSS FATSPTGASNSKYVSADRNLI KNT 
APVNTVMDS'PVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDFIC 
PNSN I PDQESS LQS FCNSENKVLKENAD FLSLRQTEL PGNS CAQ 
DPASFMPPQQPCSFPSQSLSDAESISKHMSLSYVANQEPGILQQ 
KNAVQ 1 1 SSALDTDNESTKDTEbTCFVLGDVQKTDAFVP VYSDS T 
IQEAS PNFEKAYTLPVLPSEKDFNGSDASTQLNTKYAFSKLTYK 
SSSGHEVENSTTDTQVISHEKENKLESLVLTHLSRCDSDLCEMN 
AGMPKGNLNEQDPKHCPESEKCLLSIEDEESQQSILSSLENHSQ 
QSTQ PEMHKYGQLVKVELEENAEDDKTENQI PQRMTRNKANTMA 
NQSKQILASCXLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPOPVOVS PSLLOAKEKTOriQT JVATVnQT.Pf T.ni? move crta 
ANP Y F E YLH I R KKI E EKRKLLCS V I PQ APQYYDE YVTFNGS YLL 
DGNPLSKICIPTITPPPSLSDPLKBLFRQQEWRMKLRLQHSIE 
REKLIVSNEQEVLRVHYRAARTLANQTLPFSACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCLLMRQ 
QHEAAALNAVQRLEWQLKLQELDPATYKSISIYEIQEFYVPLVD 
VNDDFELTP1 


6264


143 


1960


KHR0EWNALDMAPEIHMTGPMCLIENTNGELVANPEALK1LSAX 
TQPVVWAI VGLYRTGKS YLMNKLAGKNKGFSLGST VKSHTKG I 
WMWCVPHPKKPEHTLVLLDTEGU3DVKKGDNQNDSWIFTLAVLL 
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPOFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 








SQKDKNFNLPRl,CIRKFFPKKKCFVPDLPIHRKKLAQLEkLQDE 
ELDPEFVQQVADFCS Y I FSNS KTKTLSGGI KVNGPRLESLVLTY 
I NAIS RGDLPCMENAVLALAQ I EN SAAVQKAI AHYDQQMGQKVQ • 
LP AETLQE LLDLHR VS EREATEVYMKNS F KDVDH LFQKKLAAQL 
DKKRDDFCKQNQEAS SDRCSALLQVI FS P LEEEVKAG I YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEII^TYLKSKESVTDAIL 
QTDQ I LTE KEKEI EVECVKAESAQASAKMVEEMQI K YQQMMEE K 
EKSYQBHVKQLTEKMERERAQLLEEQEKTLTSXLQEQARVLKER 
COGESTQLQNEIQKLQKTLKKKTJCRYMSHKLKI 


6265 


143


1960 


KHRQENNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI 
TQPVVWAlVGLYRTGKSYLMNKIiAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPE^iTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL 
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVS FFPD FVWTLRDFS LDLE ADGQ PLT P DE YLE YS L KLTQGT 
SQKDKNFIHjPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKIiQDE 
E LD PE FVQQVADFCS Y I FSNS KTKTLSGG I KVNG PRL E S LVLT Y 
INAI S RGDL P CMENAVIiALAQ I ENS AAVQKAIAH YDQQMGQKVQ 
LPAETLQELLDLHRVSEREATEVYMKNS FKDVDHLFQKXLAAQL 
DKKRDDFCKQNQEASSDRCSALLQVT FS PLEEEVKAGI YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQI LTE KEKE IBVECVKABSAQASAKMVEEMQI KYQQMMEEK 
EKS YQEHVKQLTE KMERERAQLLEEQEKTLTS KLQEQARVLKER 
COGESTQLQNE IQKLQKTLKK KTKRYWSHKLKI 


6266 


276 


1421 


GSHQKQMLVPCFLYSLQNRXPSLYGSLTCQG IGLDG 1 PEVTASE 
GFTVNEINKKSIHISCPKENASSKFLAPYTTFSRIHTKSITCLD 
ISSRGGLGVSSSTDGTMKIWQASNGELRRVLEGHVFDVNCCRFF 
PSGLWLSGGMDAQLKI WS AED AS CWT FKGHKGG I LDTAIVDR 
GRNWSASRDGTARLWDCGRSACLGVIiADCGSSINGVAVGAADN 
SINLGSPEQMPSEREVGTEAKMLLLAREDKKLQCLGLQSRQLVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGAPVLSLLSVRDGFrASQGDGSCFIVQQDLDYVTELTGADCD 
PVYKVATWEKQIYTCCRDGLVRRYQLSDL


6267


3 


622 


LGMMKKNNSAKRGPQDGNQQPAPPEKVGWVRKFCGKGI FREI WK ' 
NR YWLKGDQL YI SE KE VKDE KN I QE V FDLS D YEKCE ELR KS KS 
RSKKNHS KFTLAHSKQPGNTAPNL I FLAVS PEEKES WINALNSA 
ITRAKNRI LDEVTVEEDS YLAHPTRDRAKIQHSRRP PTRGHLMA 
VASTS TSDGMLTLDLIQEEDPSPEEPTSLC 


6268 


160 


1368 


HRELCQNLPAGLSSALI DNPLTLLLS I DT YVMLQE P VT FQDVAV 
DFSREEWGLLGPTQRTEYRDVWLETFGHLVSVGWETTLENKELA 
PNSDIPEEEPAPSLKVQESSRDCALSSTLEDTLQGGVQEVQDTV 
LKQMES AQE KDLPQKKH FDNRES QANS GALDTNQ VS LQKI DNP E 
SQANSGALDTNQVLLHKIPPRKRLRKRDSQVKSMKHNSRVKIHQ 
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
IFRNPRYFSVHKKIHTGERPYVCQDCGKGFVQSSSLTQHQRVHS 
vUlxtre C^vciLbKI r WUK5AISQHLRTHTGAKPYKCQDCGKAFRQ 
S S H L I RHQRTHTG ER PYACNKCGKAFTQSSH L I GHQRTHNRTKR 
KKKQPTS j 


6269 


2886 


1449 


HASAPTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL " 
TQ YVKWTNDKS LGG I EG CLS KLKAAD PTFVMGHAMATGL VLI GT 
GSS VKLDKELDLAVKTMVE I SRTQPLTRREQLH VSAVETFANGN 
FPKACELWEQ I LQDHPTDMLALKFSHDAYF YLG YQEQMRDS VAR 
I YP FWTPDI PLSS YVKGI YS FGLMETNFYDOAEKLAKEALSINP 
TDAWS VHTVAH I HEMKAE I KDGLEFMQHSETLWKDSDMLACHNY 
WHWALYLIEKGEYEAALTIYDTHILPSLQANDAMLDVVDSCSML 
YRLQMEGVSVGQRMQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HDPQTTQELLTTLRDASESPGENCQHLLARDVGLPLCQALVEAE 
SEQ
ID
NO:



6270 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



"23" 



Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 






6271 



32 



6272 



10*8 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

DGNPDRVLKbLLPIRYRiVQLGGSNAQRDVFNQLLIHAALNCTS"' 

SVHKNVARSLLMERDALKPNSPLTERL IRKAATVHLMQ 

SVTVTLGSEGJX>RPPTYHLEEMEQEPQNGEPAEIKI IREAYKKA* 

FLFVNKGLNTDELGQKEEAKNYYKQG1GHLLRG1SISSKESEHT 

GPGWESARQMQQKMKETLQNVRTRLEILEKGLATSLQNDLQEVP 

KLYPEFPPKDKCEKLPEPQSFSSAPQHAEVNGNTSTPSAGAVAA 

PASLSLPSQSCPAEAPPAYTPQAAEGHYTVSYGTDSGEFSSVGE 

E FYRNHS Q P P PLE TLGLDADEL I LI PNG VQ I FFVNPAG E VS A PS 

YPGYLRIVRFLDNSLDTVLNRPPGFLQVCDWLYPLVPDRSPVLK 

CTAGAYMFPDTMLQAAGCFVGWLSSELPEDDRELFEDLLRQMS 

DLRLQANWNRAEEENEFQIPGR7RPSSDQLKEASGTDVKQLDQG 

NKD VRHKG KRG KRAKDTSS EE VNLSH I V p CE P VPBE K P XELPE W 

SEKVAHNILSGASWVSWGLVKGAEITGKAIQKGASKLRERIQPE 
E K P V BVS P AVT KG L Y I AKQATGGAAKVSQ FLVDGVCTVANC VG K 
ELAPHVKKHGSKLVPESLKKDKDGKSPLDGAMWAASSVQGFST 
VWQGLECAAKCIVNNVSAETVQTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYNINNIGIKAMVKXTATQTGHTLLEDYQIVDNSQRENQ 
EGAANVNVRGEKDEQTKEVKE AKKXDK 
. GCG VKTAGMVGREKELS IHFVPGS CRLVEEE VNI PNRRVLVTGA - 
TGLLGRAVHKEFQQNNWHAVGCGFRRARPKFEQVNLLDSNAVHH 
IIHDFQPHVIVHCAAERRPDWENQPDAASQLNVDASGNLAKEA 
AAVGAFL I Y I S SDYVFDGTNPP YREED I PAPLNL YG KT KLDG KK 
AVLENNLGAAVLRI P I LYGEVEKLEESAVTVM FDKVQFSNKSAN 
^HWQQRFPTHVKDVATVCRQLAEKRMLDPSIKGTFHWSGNEQM 
TKYEMAC A I ADAFNL PS SHLRPITDS P VLGAQRP RNAQ LDCS KL 
1 ETLGIGQRTPFRIGIKESLWPFLIDKRW RQTVFH 



"528" 



6273 



256 



6274 



6275 



20 



T27T 



797 



843 



GAVWEDAAAPGRTEGVLERQ GAPPAAQQGGALVHLTPTPGGLAL 
I V S P YHTHRAGDPLDLVAIiAEQVQKADEFIRANATNKLTVIAEQI 
QHLQEQARKVLEDAHRDANLHHVACNIVKKPGNIYYLYKRESGQ 
QYFSIISPKEWGTSCPHDFLGAYKLQHDLSWTPYEDIEKQDAKI 
SMMPTLLSQSVALPPCTEPNFQGLTH 



1142 



£JCP R VS P E CRS LGCQ VMFSLPLNCS PDHI RRGS CWGRPQDLK I A 
SAAWNS KCHPGAGAAMARQHARTLW YDRPRYVFME FCVEDSTDV 
HVLIEDHRIVFSCKNADGVELYNEIEFYAKVNSKDSQDKRSSRS 
I TC F VRKWKE KVAW PRLTXEDI KP VWLS VDFDNWRDWEGDE EME 
LAHVEHYAEVRDNTYCVLPT 



AAAAMAAAAGGGAG AARS LS RFRG CLAGAIjLGDCVGS F YEAHDT - " 

VDLTSVLRHVQSLEPDPGTPGSERTEALYYTDDTAMARALVQSL 

LAKEAFDEVDMAHRFAQEYKKDPDRGYGAGVVTVFKKLLNPKCR 

DVFE PARAQFNGKGS YGNGGAMRVAG I SLAYSS VQDVQKFARLS 

AQLTHASSLGYNGAILQAIjAVHIiALQGESSSKHFLKQLIiGHMED 

LEGDAQSVLDAREIiGMEERPYSSRLKKIGELLDQASVTREEWS 

ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSLQRTLIYSI 

SLGGDTDTIATMAGAIAGAYYGMDQVPESWQQSCEGYEETDILA 
QSLHRVFQKS 



565 



SRRGRARCLAh^KKPVPRPAKTiNlAFM ViCTMVGGQLKNLTGSLQ 

GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 

KAERATLRSHFRDKYRLPKNETDESQIQMAGGDVELPRELAKMI 

BEDTEEEEEKASVLGQLASLPGLNLGSLKDKAQATLGDLKQSAE 
KCHVM 



TLLPliPPLPDTEGMILLNTGLEGTVAENPVPIVHTPSGNILTLE" 
S CL0QLATH PGHWG I HLQIAE PAALRP SLALLARLS S LGLLH WP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
LG SG YR EQLLTDML E LCQGLWQP VS FQMQAMLLGHS TAG A IGR L 

LASSPRATVTVEHNPAGGDYASVRTALIiAARAVDRTRVYYRLPQ 
1 GYHKDLLAHVGRN 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6277 


4600 


2744 


MAFRTEMG LY YS YFKT I VEAPSFLNG V WM I MNDKLTE Y P L V I NT 
LKRFNLYPBVILASWYRIYTKIMDLIGICTTKICTnVTIGEGLSP 
TESCEGLGDPACFYVAVIFILNGLMMALFFIYGTYLSGSRLGGL 
VT VLCFFFNHGE CTR VM WTP P LRES FS YPFLVLQMLL VTH I LRA 
TKLYRGSLIALC I SNVFFML PWQFAQFVLLTQI ASLFAVYWGY 
IDICKLRKI 1 YIHMISLALCFVLMFGNSMLLTSYYASSLVI I WG 
ILAMKPHFL KI NVSELS LWVI QGCFWLFGTVI LKYLTS K I FG I A 
NDAH I GNLLTS K? FS YKD FDTLLYTCAAEFDFME K3TF LR YT KT 
LLLP WLVGFVA I VRKI I S DMWGVLAKQQTHVRKHQFDHGEL VY 
HALQLLAYTALG I LIMRLKLFLTPHMCVMASL tCSRQLFGWLFC 
KVHPGAI^AILAAMSIQGSANLQTQWNIVGEFSNLPQEELIEW 
I KYSTKPDAVFAGAMPTMAS VKLSALR P I VNHPH YEDAGLRART 
KIVYSMYSRKAAEEVKRELIKLKVNYYILEESWCVRRSKPGCSM 
P E I WDVE DPANAGKTPLCNLL VKDSKPH FTTVFQNS VYKVLE W 
KE 


6278 


3 


823 


ILFRLVLLSLVYLLNSVATEERKPAEVLIVEGQQYAWGTVLLL 
IRIILEYCQGVDNrPSVTTDMLTRLSDLLKYFNSRSCQLVLGAG 
ALOWGLKTITTKNLALSSRCLQLIVHYIPVIRAHFEARLPPKQ 
YSMLRHFDHITKDYHDHIAEISAKLVAIMDSLFDKLLSKYEVKA 
P VPS ACFRNICKQMT KMHE A I FDLLP EEQTQMLFLR INAS Y KLH 
LKKQLSHLNVI NDGG PQNGLVTAD VAF YTGNLQALKGLKDLDLN 
MAEIWEQXR 


6279 


127 


1687 


GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTL - 
LRSTAKMPTTPVKAKRVSTFQE FESNTS DAWDAGEDDDELLAMA 
AESLNSEWMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT 
VTLGGTSDPSTLSSSALSEREASRLDKFKQLLAGPNTDLEELRR 
LS WSG I PKPVRPMTWKLLSGYLPANVDRRPATLQRKQKE YFAFI 
EHYYDSRWDEVHQDTYRQIHIDIPRMSPEALILQPKVTEIFERI 
LFIWAIRHPASGYVQGINDLVTPFFWFICEYIEAEEVDTVDVS 
GVPAEVLCNIEADTYWCMSKLLDGIQDNYTFAQPGIQMKVKMLE 
ELVSR IDEQVHRHLDQHE VR YLQFAFRWMNNLLMREVPLRCTI R 
L WDTYQS E PDGFSH FHL YVCAAFLVRWR KE I LEEKD FQE LLLFL 
QNLPTAHWDDEDISLLLAEAYRLKFAFADAPNHYKK 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
MDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA 
QKHS FPRMLHQRERGLCHRGS FSLGEQSR VI SHFLPNDLGFTDS 
YSQXAFCGI YSKDGQ 1 FM5ACQDQTI RLYDCR YGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LR PDERRFAVFS IAVS SDGREVLGGANDGCL YVFDREQNRRTLQ 
IESHBDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK 
P VGALAGHQDG I TF I DS KGDAR YL I SNS KDQT I KLWDI RRFSSR 
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
vvijHiLaKLKlfSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK 
LTNHKACVRD VS WHP F E EKI VS S S WDGNLRLWQYRQAEY FQDDM 
PBSEECASAPAPVPQSSTPFSSPQ 


6281


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEBEEEDE
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEBENDRA
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVBLATGQLGLRRAA

QKHS FPRMLHQRERGLCHRGS FS LGEQSRVISH FLPNDLG FTDS 
YSQKAFCGI YSKDGQI FMSACQDQTI RLYDCR YGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK 
PVG ALAGHQDGI TF I DS KGDAR YLI SNS KDQT I KLWDI RR F S S R 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








EGMEASRQAATQQNWDYRWCX)VPKKAWRKLKIiPGDSSLMTyRGH 
GVLHTLIRCRFSPIHSTGGQFIYSGCSTGKVWYDLLSGHIVKK 
LTNHKACVRDVSWHPFBEKIVSSSWDGNLRLWQYRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6282 


125 


906 


RMAACRALKAVLVDLSGTLHI BDAAVPGAQEALKRLRGASVI IR 
FVTNTTKESKQDLLERI*RKLEFDISEDEIFT<;t,TAAR<;t t FPim 
VRPMLLVDDRALPDFKG IQTSDPNAVVMGLAPEHFHYQ I LNQAF 
RLLLDGAPL IAI HKARYYKRKDGLALGPGPFVTALEYATDTKAT 
VVG KPEKTFFLEALRGTGCEPE EA VM I G DDCRDD VGGAQDVGML 
G I L VKTG KY RAS DEE K I NP P PYLTC ES F PHAVDHI LQHLL 


6283 


140 


1043 


LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPSIPKNALPITKPTSPAPAAQSTNGTHASYGPFYLEYSLL 
AEFTL WKQ KL PG VYVQ PS YRSALMW FG VI F I RHGLYQDG VFK F 
TVYIPDNYPDGDCPRLVFDIPVFHPLVDPTSGELDVKRAFAKWR 
RNHNHIWQVLMYARRVFYKIDTASPLNPEAAVLYEKDIQLFKSK 
WDSVKVCTARLFDQPKIEDPYAISFSPWNPSVHDEAREKMLTQ 
KKKPEEQHNXSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 


1 


2879 


RS VI PGSTI SSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQVERENVQKRTFTRWINLHLEKCNPPLEVKDLFVDIQDGKI 
LMALLEVLSGRNLLHEYKS SSHRI FRLNNIAKALKFLEDSNVKL 
VSIDAAE1ADGNPSLVLGLIWNIILFF0IKELTGNLSRNSPSSS 
liAPGSGGTDSDSSFPPTPTAERSVAISVKDQRKAIKALLAWVQR 
KTRK YGVA VQD FAGS WR SG LAFLAVI KAI DPS LVDMKQALENST 
RENLE KAFS I AQDALH I PRLLE PED I MVDTPDEQS I MT Y VAQFL 
ERFPELEAEDI FDSDKEVP I ESTFVR I KETPSEQESKVFVLTEN 
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRLDGVSSHALSDS 
STEFMHQIIDQVLQGGPGKTSDISEPSPESSILSSRKENGRSNS 

LPlKKTVHFEADTYK'nDPrQTfKrT CT r^WCC-rimr* v~n>r*r rtoivM*... 

LAVEVAEEKEQKQESSKIPESSSDKVAGDIFLVEGTNNNSQSSS 
CNGALES TARHDE ESHSLSPPG ENTVMADS FQ I KVNLMTVE ALE 

EGDYFEAIPLKASKFNSDLIDFASTSQAFNKVPSPHETKPDEDA 
E AFENHAE KLGKRS I KS AH KKKDS P E PQ VKMD KH E PHQDSGE E A 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
EEADGSQSSS6SSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
LAPHEDHQQRBTKENDPMDSHQSQESPNLEMIANPLEENVTKES 
ISSKKKEKRKHVDHVESSLFVAPGSVQSSDDLEEDSSDYSIPSR 
TSHSDSS I Y LRRHTHRSSESDHFSLCS VEERSRSG 


6285 


2157 


1*31 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAAFIFSYITAVTLHII - 
I D PALP Y I SDTGTVAP EKCLFG AMLN I AAVLCIAT I YVRYKQ VH 
ALSPEENVI IKLNKAGLVLGILSCLGLS I VANFQKTTLFAAHVS 
GAVLTFGMGSLYMFVQTILSYQMQPKIHGKQVFWIRLLLVIWCG 
VS ALSMLTCS SVLHSGNFGTDLEQKLHWNPEDKG YVLHM ITTAA 
EWSMSFS FFGFFLT YI RDFQKI SLRVEANLHGLTLYDTAPCP IN 
NERTRLLSRDI 


6286 


1619 


274 


KAGASCCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
PFSIPAASEIADLSNIINKLLKDKNEFHKHVEFDFLIKGQFLRM 
PLDKHMEMENISSEEWEIEYVBKYTAPQPEQCMFHDDWISSIK 
GAEEWILTGSYDKTSRIWSLEGKSIMTIVGHTDWKDVAWVKKD 
SLSCLLLSASMDQTILLWEWNVERNKVKALHCCRGHAGSVDSIA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRFCKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEEICSASWDHTIRVWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVSLSLTS HTGWVTS VKWS PTHEQQLI SGSLDNI VKLWDT 
RSCKAPLYDLAAHEDKVLSVDWTDTGLLLSGGADNKLYSYRYSP 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
TTSHVGA 


6287 


. 27* 

V 


1482 


MQFFFNFQ1GLRSTSGKEKYSGDAGPLGDALQLPLQCLALDEDF 
APAKLQVQKILCDLLLPENLKEGLKBSSWSSLPCTKNRPFDFHS 
VMEESQSLN2PSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
M P ARE DCLKR VS S E P VLS VQE KG VLLKR KLSLLEQD V I VNEDGR 
NKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFEr<;T,rMpr ffp 
P VTT PCGHS FCKNCLERCLDHAP YC PLC KES LKE YLADRR YCVT 
QLLEELIVKYLPDELSERKKIYDEETAELSHLTKNVPIFVCTMA 
YPTVPCPLHVFEPRYRLM1RRSIQTGTKQFGMCVSDTQNSFADY 
G CMLQ I RNVH FL PDGRS WDT VGG KRFR VLKRGMKDG YCTAD I E 
YLEDV 


6288


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKir;PQTT,Kic;nT awt r" n err P — 

MLDSGADYLHliDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM 

HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIXDIRENGM- 

KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM 

PKVHWLRTQFPSLDIEVIXX3VGPDTVHKCAEAGANMIVSGSAIM 

RSEDPRSVINLLRNVCSEAAQKRSLDR 


6289 


1 


743 


VTLYPCRGLVGNIiLGASGMASGCKIGPSILNSOLANLGAECLR 
MLDSGADYLHLD VMDGHFVPNI TFGHP WES LR JCQLGQDPFFDM 
HMMVS KPEQWVKPMAVAGANQYTFHLEATEWPGAL1 KDI RRNGM 
KVGLAIKPGTSVEYLAPWANOIDMALVMTVEPGFGGQKFMEDMM 
PKVHWLRTQFPSl^IEVL^VGPDTVHKCAEAGANMIVSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


3 


1856


iixSRWLLGVYEWAPTLACLPRPRLRRRRRRRRRRMJSRYXRKA 

VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 

DITRESS FTSADTGNS LS AF PS YTGAG IS TEGSS DFS WG YGELD 

QNATEKV0TMFTAIDETJ,YFOKT.£VWTVCT/^iri?rvv^ma»* <- Pn i.T 
** » *» »*■ in^umju i &yt\ uj "rix j\oijy£,tCQQWTASFPHLj 

RILGRQI ITPSEGYRLYPRSPSAVSAS YETTLSQERDSTI FGIR 
GKKLHFSSSYAHKASS I AKSSSFCSMERDEEDS I I VSEGI 1EEY 
LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCMKEDVLAY 
VFDSVWCKWS CMEQLTRSHWEGFASDDESNVAVTRPDS ESS CV 
LSELHPLVLPRVPQSKVLYITSNPMSbCQASRHQPNVKDLLVHG 
MPLQP RNLS LMD KL LDLDDKLLMR PG S STI LSTRN W PNRAVE FS 
TSSLS YTVQS TRRRNPP PRTLHP I S TS HS CAE TPRS VE E I LRGA 
RVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVEHVSTVGPQRQMK 
PHGDSSRAQSAWDEPNYQOPOERLLLPDFFPPPMTTnQPT t nT 

QYRRSCAVEYPHQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 

P * ' 


6291 


1732 


602 


LVAKMASSASARTPAGKRVINQEELRRLMKEKQRLSTSRKRIES 
PFAKYNRLGQLSCALCNTPVKSELLWQTHVLGKQHREKVAELKG 
AKEASQGSSASSAPQSVKRKAPDADDQDVKRAKATLVPQVQPST 
SAWTTNFDKIGKEFIRATPSKPSGLSIiLPDYEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV 
RKVDAPKJ3QMDKEWDEFQKAMRQVNTISEAIVAEEDEEGRLDRQ 
IGEIDEQIECYRRVEKLRNRQDEIKNKLKEILTIKELQKKEEEN 
ADSDDEGELQDLLSQDWRVKGALL 


6292 


1835 


1142 


TCPGAMKMVAPWTRF YSNSCCLCCHVRTGTI LLGVW YLI INAW 
LLILLSALADPDQYNFSSSELGGDFEFMDDANMCIAIAISLLMI 
L I CAMATYGA YKQRAAW 1 1 P F FCYQ I FD FALNMLVAI TVL I Y PN 

SIQEYIRQLPPNFPYRDDVMSVNPTCLVLIILLFISIILTFKGY 

LISCVWNCYRYINGRNSSDVLVYVTSNDTTVLLPPYDDATVNGA 
AKEPPPPYVSA 


6293 


2382 


1035 


tWCTLGTVDVHPIGWCAINSKILVPPRTIHAKFTDWKGYLMKRL 
VGSRTLPVDFHIKMVESMKYPFRQGMRLEWDKSQVSRTRMAW 
DTVIGGRLRLLYEDGDSDDDFWCHMWS PLIHPVGWSRRVGHGI K 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK 

LEAIDPLNLGNICVATVCKVLLDGYLMICVDGGPSTDGLDWFCY 

HASSHAIFPATFCQKNDIELTPPKGYEAQTFNWENYLEKTKSKA 

APSRLFT^DCPNHGFKVGMKLEAVDI^IEPRLICVATVKRVVHRL 

LSIHFDGWDSEYDQWVDCESPDIYPVGWCELTGYQLQPPVAAEP 

ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPLLEDD 

PQGARKISSEPVPGEIIAVRVKEEHLDVASPDKASSPELPVSVE 
NIKQETDD 


6295 


354 


1814 


i kijKI VAbb VKWI P5FFPDLELYSCCLGTDRGFPELSHHC 

KNVIATAS DYDMAE I TNI R PS FDVS P WAGLIGAS VLWCVSVT 

VFVWSCCHQQAEKKHKNPPYKFIHMLKGIS I YPETLSNKKKI I K 

VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDQ 

LPIKMDYGEELRSPITSLTPGE5KTTSPSSPSEDVMLGSLTFSV 

DYNFPKKALWTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR 

VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS 

RDDVIGEVMVPLAGVDPSTGKVQLTRDIIKRNIQKCISRGELQV 

SLSYQPVAQRMTVWLKARHLQKKDIAGLSGNPYVKVNVYYGRK 

RIAICKKTHVKKCTLNPIFNESFIYDIPTDLLPDISIBFLVIDFD 

RTTKNEWGRLILGAHSVTASGAEHWREVCESPRKPVAWfflSLS 
EY 




2795 


*17 


VS S ALLTGATS G3 DAAKS EGAS AS PLS CTNAVAMDR PDEG P PA K 
TRRLSSSESPQRDPPPPPPPPPLLRLPLPPPQQRPRLQEETEAA 
Q VLADMRGVGLG PAL P P P P P Y V I LE EGG I RAY FTLG AECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSFGGE 
GALETCSAVGWAPQRLVDPKSKEEAIIIVEDEDEDERESMRS9R 
RRRRRRRRKQRKVKRESRERNAERMES ILQALEDI QLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFLERRDLI IQHI PGFWVKAFLNHPR 
IS ILINRRDEDI FRYLTNLQVQDLRHISMGYKMKLYFQTNPYFT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
r r :> w t awHb bPhADR IAE 1 1 KNDLWVN PLR YYLRERGSRI KRKK 

QEMKKRKTRGRCEWIMEDAPDYYAVEDIFSEISDIDETIHDIK 
I SDFMETTDYFETTDNE I TD I NENI CDS ENPDHNEVPNNETTDN 
NESADDHETTDNNESADDNNENPEDNNKNTDDNEENPNNNENTY 
GNNFFKGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG 
SDDDGNEGDNEGSDDDDRDIEYYEKVIEDFDKDQADYEDVIEII 
SDES VEEEG I EEG I QQDED I YEEGNYEEEGSEDVWEEGEDSDDS 
DLED VLQ VPNGWANPGKRG KTG 


6296 


727 


1199 


RHCGCDAQGACDSLP PTGTSS PVTARNA I PEARCCVWLLDGTT V ' 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSLQFPSPFSGTISFGSFSDSGIFPLGSQCCLGFQQFSISGK 
KWAL I HKRVRLS VFGARWGRI YFGK 


6297 


1 


922 


QRAAAASPSSCGPRGAEYGAiMAMEGYWRFLALLGSALLVGFLS " 
VIFALVWVLHYREGLGWDGSALEFNWHPVLMVTGFVFIQGIAII 
V YRLPWTW KCS KLLM KS I HAG LNAVAAI LA IIS WAVF ENHNVN 
NI ANMYSLHS WVGLIAVI CYLLQLLSGFS VFLL PWAPLSLRAFL 
MP IH VYSG I V I FGTV I ATALMG LTEKL I FS LR D P A YSTFP PEGV 
FVNTLGLLILVFGALIFWIVTRPQWKRPKEPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELNNEVAARKRNLALDEAGQRSTM 


6298 


3 


985 


S VPLRRLS LS GTLQGAGTTTKMAVAR LAAVAAWVPCRS WG W AA V 
P FGPHRGLS VLLAR I PQRA PR WLPACRQ KTSLSFLNR PDL PNLA 
YKKLKGKS PGIIFI PG YLS YMNGTKALAI EEFCKSLGHACIRFD 
YS GVGS S DGNS E ES TLGKWRKD VLS 1 1 DDLADGPC; I LVGS S LGG 
WLMLHAA I AR PE KWAL I G VATAADTLVTKFNQLP VELKKEVEM 
KGVWSMPSKYSEEGVYNVQYSFIiCEAEHHCLLHSPIPVNCPIRL 
LHGMKDDIVPWHTSMQVADRVLSTDVDVILRKHSDHRMREKADI 
QLLVYTIDDLIDKLSTIVN 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6299


512 


814 


BC^LEGIMPNVTISl^LPTNGSPi^DILVHPCVTSLDSAiLTSS 
S I DAMDDS AFSGPYKFPFTPPLES FNLCFYTSQVPVPP I LG FYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSSPRLLVSRAAAPSAGPW3AWRQGARA 
AQSP7SIPNSSSVPYGSQDSVHSSPSDGGGGRDRPVGGSPGGPR 
LVIGSLPAHLSPHMFGGPKCPVCSKFVSSDEMDLHLVMCLTKPR 

ITYNEDVLSKDAGECAICLEELQQGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


G KF VP VN W E P PQ PL F P P KY LR C YRCLLETKELGCLLGS DI CLTP 
AGSSC I TLH KKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGFW 
I FSQ YCFLD FCNDPQNRG L YTP 


6302 


490 


745 


I FGFLHLFHNEHSFLLVCALFAHVFFSSSCGSSVALHSDPCLLS '" 
PVLLNCLPGDLRPLDELYAQKLKYKAI SEELDHALNDKTS L 


6303 


2 


1961 


YWNEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL 
YW YYLEQFQYW EAQGWTFDASQS C DTDTYTS KTEADD KNDE KCM 
KVDLVS FLSS PI MGDNDSSGTSDKDHS EILDGI SN I KLNS EE VT 
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRKGGTNEESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PS KLKRS HELD I DENPASDFDDSGSLLGFKYGSGQKYGG I PNFS 
HROVRYLEK^ T ViCLKSKYLD^IRRQIKMKNKHIFFTKESEKPFFKK 
S K I LS KVEKFLTWVNKPMDEEASQESSSHDNGHDAS TSCDS EEQ 
DMSVKKGDDLLETNNPEPEKCQSVSSAGELETENYfiRDSLLATV 
PDEODCVTQEVPDSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA 
VPELAKYWAQRYRLFSRFDDGI KLDREGWFSVTPEKIAEH IAGR 
V^SQSFKCDVVVDAFCGVGGNTIQFALTGMRVIAIDIDPVKIALA 
RNNAE VYG I AD K I E FI CGDFLLLAS FL KAD WFLS PP WGG PD YA 
TAETFDIRTMMSPDGFEIFRLSKKITNNIVYFLPRNADIDQVAS 
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438 


HRAR VDRS RES PGGDLRHPGRVRRD I TLSGH PR LSTQH VVLLRE 
DEVGDPGTKDLGHPQHGSPIQETQSEWTLVSPLPGSDMAALPA 
WRATSGLTLWPHTAEGRDLLGAENRALTGGQQAEDPTLASGAYQ 
WPGSVEKLQGSVWCDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 
AQGEWDKARVPAHGQVLQVGFSTEAALQDLSSPRLSQLCSQGL 
CGLIKRPGDLPEVLSFHVDRVLGLRRSLPAVARRFHSPLLPYRY 
TDGGAR P VI WW APDVQHLS D PDEDQNS LALGWLQYQALLAHSCN 
MPGQAPCPGIHHTEWARLALFDFLLQVHDRLDRYCCGFEPEPSD 
PCVEE RLREKCRNPAELRLVH I LVRS SDPSHLVYI DNAGNLQH P 
EDKLNFRLLEGIDGFPESAVKVLASGCLQNMLLKSLQMDPVFWE 
SQGGAQG LKQVLQTLEQRGQ VLLG HI Q KHNLTL FRDE DP 


6305


93 


420 


NM I WRGRS TYR PR PRRS VPP PELIGPMLE PGDEEPQ QEEPPTES 
RDPAPGQEREEDQGAAETQVPDLEADLQELSQSKTGDECGDGPD 
VQGKILTKSEQFKMPEGR 


6306 


1


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATR V PNDCLGEAL I NGD VPME EATDFS BAD KS ALM D 
ESSDSGVI PGS HS Eft ALHAS EE E EG EGG KAQSS LGY I PLMRWQ 
SVRHTTRKSSTTLREGWWHYSNKDrLRKRHYWRLDCKCITLFQ 
NNTTNRYYKEIPLSEILTVESAQNFSLVPPGTNPHCFEIVTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQASLSISVSNSQIQENVDIATVYQIFPDEVLGSGQF 
GVVYGGKHRKTGRDVAVKVIDKLRFPTKQESQLRNEVAILQSLR 
HPGIVNLECMFETPEKVFWMEKLHGDMLEMrLSSEKGRLPERL 
TKFLITQILVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
DFGFAR I IGEKS FRRS WGTPAYLAPEVLLNQGYNRSLDMWS VG 
VIMYVSLSGTFPFNEDEDINDQIGNAAFMYPASPWSHISAGAID 
LINNLLQVKNRKRYSVDKSLSHPWLQEYQTV7LDLRELEGKMGER 
Y ITHES DD AR W EQFAAE HPLPG SGL PTDRDLGGACPPQDHDMQG 
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 
LAERISVL


6307 
ci no 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKVVRQSKFRHVFG 
Q P VKNDQ CYED I RVS R VTWDS T FCAVNP KFLAVI VEASGGGAFL 
VLPLS KTGR I DKAYPTVCGHTG PVLDIDWCPHNDEV I ASGS EDC 
TVMVWQ I P ENG hTS PLTE P VWLEGHTKRVG I IAWHPTARNVLL 
SAGCDNVVIiIT'nJVGTAEELYRLDSLHPDLIYNVSWNHNGSLFCS 
ACKDKS VR I IDPRRGTLVAERBKAHEGARPMRAI FLADGKVFTT 
GFSRMSERQLALWDPENLEEPMALQELDSSNGALLPFYDPDTSV 
VYVCGKGDSSlRyFEITEEPPYIHFLNTFTSKEPQRGMGSWPKR 
GLEVS KCEIARFyKLHERKCEP I VMTVPRKSDLFQDDLYPDTAG 
PFAALEAEEWVSGRDADPILISLREAYVPSKQRDLKISRRNVLS 
D5RPAMAPGSSHLGAPASTTTAADATPSGSLARAGBAGKLEEVM 
QELRALRALVKEQGDRICRLEEQLGRMENGDA 


6308


2 


1118 


GRPTRPEKMLLSLVLHTYSMRYLLPSWLLGTAPTYVLAWGVWR 
LLSAFIiPARFYQALDDRLYCVYQSMVIjFFFENYTGVQILIjYGDL 
PKNKENIIYLANHQSTVDWIVADILAIRQNALGHVRYVLKEGLK 

WLPLYGWYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYL

VIFPEGTR YNPEQTKVLS ASQAFAAQRGLAVLKHVLTPRI KATH 

VAFDCMKNYLDAIYDVTW.YEGKDDGGQRRESPTMTEFLCKECP

KIHI H IDRI DKFCDVPEEQEHMRRWUIERFEI KDKML IE F YES PD 
PERR KRFPGKS VNS KLS X KKTLPSML I LSGLTAGMLMTDAGRKL 
YVNT W I YGTLLGCLWVT I KA 


6309 


220 


563 


L VAEVKE PCS LPMLS VDMENKENGS VG VKNSMENGRP PD P AD WA 
VMDWNYFRTVGFEEQASAFQEQEIDGKSLLLMTRNDVLTGLQL 
KLGPALKI YE YHVKPLQTKHLKNNS S 


6310 


36 


979 


GPRCWKFLILSSVNCETLRIGKAWPQSSGQERYWTPRTHSSASE 
AQRGS LA3LNVAAAGLWADCDQPLYDCPMCGLI CTOYHILQEHV 
DLHLEENSFQQGMDRVQCSGDLQIiWIQLQQEEDRKRRSEESRQE 
IEEFQKLQRQYGLDNSGGYKQQQLRNMEIEVNRGRMPPSEFHRR 
KADMMESLALGFDDGKTKTSGI I EALHR YYQNAATD VRR VWLS S 
WDHFHSSLGDKGWGCGYRNFQMLLSSLLQNDA YNDCT.iKGML I P 
CIPKIQSMI EDAW KEGFD PQGAS QL 1 1 RLGGTKAW IGACE VY I L 
LTSLRV 


6311 


1 


675 


PVWWNSCEGPRIJUUVAJlTGHGVGRRARLACJuGEPRVKAAVKLTL 
ASKLKRDDGLKGSRTAATASDSTRRVSVRDXLLVKEVAELEANL 
PCTCKVHFPDPNKLHCFQIiTVTPDEGYYQGGKFQFETEVPDAYN 
MVPPKVKCLTKIWHPNITETGEICLSLLREHSIDGTGWAPTRTL 
KDWWGLNSLFTDLLNFDDPLNI EAAEHHLRDKEDFRNX VDD Y I 
KRYAR 


6312 


213 


1400 


GDELVKREAGMKWLPGVGVFGTGSSARVLVPLLRAEGFTVEALW 
GKTE EEAKQLAEEMN I A F YTS RTDD I LLHQD VD LVC I S I PP PLT 
RQISVKALGIGKNVVCEKAATSVDAFRMVTASRYYPQLMSLVGN 
VLR FLPAFVRMKQL I S EHYVG A VM I CDAR I YS GSLLS P S YGW I C 

DBLMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
I RG I RH VT.S DDFPP pAMr Mrzrzrz vr*c tiptt tuu*vtm rv» nntnTm «« n , 
v 101/ur v»r r s/nuriuvju vua i. v X JLiDrNMfc'GAF VHEVMW 

GS AGRL VARGADLYGQ KNS ATQEE LLLR DS LAVGAG LP EQGPQD 

VPLLYLKGMVYMVQALRQSFQGQGDRRTWDRTPVSMAASFEDGL 

YMQSWDAI KRSSRSGEWEAVEVLTEEPDTNQNLCEALQRNNL 


6313 


2 


2071 


QRSGAARLAFLPSP FS PACVHRS PLS FHGCWFYF VWFMPLG VL 
FIIRRRAHGCTLS CSS F VEQPTAMEAE ETM ECLQE FPEHHKM I LD 
RLNEQ REQDR FTDI TL I VDGHH FKAHKAVLAACS KFF Y KF FQEF 
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE 
FLQMLEAIKALEVRNKENSAPLEENTTGKNBAKKRKIAETSNVI 
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK 
YIQSTGSSDDSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK 
QVEG I E I VELQ LS HVKDL FHCE KCNRS FKLF YHF KE HM KSHS TE 
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
S F KCE I CNKR YLRES AWKQHLNC YHLEEGG VS KKQRTG K K I H VC
QYCEKQFDHFGHFKEHLRKHTGEKPFSCPNCHERFARNSTLKCH 
LTACQTG VG AKKGRKKLYECQVCNS VFNS WDQF KDH LV I HTGD K 
PNHCTLCDLW FMQGNE LRRH LS DAHN I SERLVT EE VLS VETR VQ 
TE P VTSMT 1 1 EQVGKVHVLPLLQVQVDSAQVT VEQVHPDLLQDS 
QVHDS HMS EL P EQ VQ VS YLE VGRI QTEEGTE VHVE ELHVE RVNQ 
M PVE VQTE LLEADLDH VT PE I MNQEE RESSQADAAE AARE DH ED 
AEDLETKPTVDS E AEKAENEDRTALPVLE 


6314 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFWVFMPLGVL 

FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD 

RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 

TQE PL VE I EG VS KMAFRHL I E FTYTAKLMI QG E E EAND VW KAAE 

r uynucfti iuujJSi vkw KlsN&APbEENTTGKNEAKKRKI AETSNVI 

TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK 

Y IQSTGS S DDSALALLAD I TS KYRQGDRKGQI KEDGCPSDPTSK 

QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 

SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 

QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 

LTACQTGVGAKKGRKKLYECOVCNSVFNSWDQFKDHLVIHTGDK 

PNHCTLCDLWFMQGNELRRHLSDAHNI SERLVTEEVLSVETRVQ 

x atr v a orj a 4. Aisy v U tWH V Ij FIjI^VQ VDSAQVTVEQ VHPDLLQDS • 

QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ 

MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 

AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015


I^l^\mVVTTLVLISYCPTATEEAPYWTYLLC^LGLFIYQSLDA 
IDGKQARRTNS CS PLGELFDHG CDS LS TVFMAVGAS I AARLGT Y 
' L " u A O *-° 17 -xorjr vr iuuiwyi x VoLjrjLiKrvaKVDVTEIQlALVI 
VFVLSAFGGATMWDYTIPILEIKLKILPVLGFLGGVIFSCSNYF 
HVI LHGG VGKNGST I AGTS VLS PGLHIGLIII LA I M I YKKS ATD 
VFEKHPCLYILMFGCVFAKVSQKLWAHMTKSELYLQDTVFLGP 
GLLFLDQ YFNN F I DE YWLWMAMV I S S FDM VI YFSALCLQISRH 
LHLNI FKTACHQAPEQVQVLSSKSHQNNMD 


6316 


1503 


792 


VSAGAGTGIMGGTTSTRRVTFEADFNPNTTWtfrtTPT QPNnyTnrS — 
MXESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAKELDRERAAANEQLTRA I LRERI CSEEERAKAKHL 
ARQLE EKDRVLKKQDAFYKEQLARLEERSS E FYRVTTEQ YQKAA 
EE VE AKFKR YE S HP VCADLQAKI LQCYRENTHQTL KCS ALATQ Y 
MHCVNHAKQSMLEKGG 


6317 


102


839 


PEAQTSAVLARBKGHLPTMRHEAPMQMASAQDARYGQKDSSDQN 
FDYMFKLLI IGNS S VGKTSFLFRYADDS FTSAFVSTVG I DFKVK 
TVFKNEKRIKLQIWDTAGOERYRTITTAYYRGAMfiPTT mvh T txt 
EES FNA VQD WS TQ I KT YS WDNAQ VILVGNKCDMEDE RV I STE RG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESIiETD 
PAITAAKQNTRLKETPPPPQPNCAC 


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPkAEGREGADSMSHLPGLELRRBAPPL 
LGPLLSPFPLPAGSWHRQMLRSSLRFPITNSAGAPCKAAGRMNI 
I^PVRRDRVLAELPQCLRKEAALHGHKDFHPRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFE VLG I PFSLQLWDTAGQERFKCI ASTY YRGAQAI 1 1 VFNLN 
DVASLEHTKQWLADALKENDPSSVLLFLVGSKKDLSTPAQYALM 
E KDALQVAQEMKAE Y WAVS S LTG ENVR EF FFRVAALT FEANVLA 
ELEKSGARRIGDWRINSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMR LNQNTLLLG KKWL VP YTS EHVPS R YHE WM KS E E LQR LT 
AS EPLTLEQEYAMQCS WQEDADKCTFI VLDAEKWQAQPGATEES 
CMVGDVNL FLTDLEDLTLGE I E VM I AE PS CRGKGLGTE A VLAM L 
S YG VTTLGLTK FE A K IGQGNE PS I RMFQ KLH FEQ VATS S VFQE V 
TbRbTVSESBHQWLLEQTSHVEEKPYRDGSAEPC


6320 


90 


1111 


RPRTGREKVAMAAVDS FYLLYRE I ARSCNC YMEALALVGAWYT A 
RKS I TVI CDPYSLI RLHFI PRLGSRADLI XQYGRWAWSGATDG 
IGKAYAEELASRGLNI I L I SRNEE KLQVVAKD I ADTY KVETD 1 1 
VADFSSGREIYLPIREALKDKDVGILVNNVGVFYPYPQYFTQLS 
BDKLWDI INVNI AAASLMVHWLPGMVER KKGAI VTISSGSCCK 
PTPQLAAFSASKAYLDHFSRALOYEYASKRT pun^TiT pitwhtc 

MTAPSNFLHRCSWLVPSPKVYAHHAVSTLGISKRTTGYWSHSIQ 
Fli FAQ YM P E WLW WIGAN I LNRSLR KEALS CTA 


6321 


1418


341 


HRKAALGALMAGRLLGKALAAVS LS LALAS VTI RSSRCRG IQAF 

RNSFSSSWFHLNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 

VPNEKVGWLVEWQDYKPVEYTAVSVLAGPRWADPQISESNFSPK 

FNEKDGHVERKSKNGLYEIENGRPRNPAGRTGLVGRGLLGRWGP 

NHAADPIITRWKRDSSGMKIMHPVSGiCHlUJFVAIKRKDCGEWA 
IPGGMVDPGEKISATLKRKF^FRIXT.WCTJ^wxcnTlrDI?Tc•c , vt uv 

LFS QDHLV I YKGYVDDPRNTDN AWM ETEAVN YHDETGE I MDNLM 
LEAG DDAGKV KWVD I ND KLKL Y AS HSQFI KLVAEKRDAH W S EDS 
EADCHAL 


6322


2047 


1083 


NQE ILKNVES SRTVQPHFLEFLLS LGWSVDVGRHPGWTGHVSTS 
n^AiiuwiAJiiuoyyciD v i jjtLLf± bAoJ. r NijQKrLvLiYi ADALiTEI 
AFWPSPVESLTDSLESNISDQDSDSNMDLMPGILKQPSLTLEL 
FPNHTDNLNSSQRLSPSSRMRKLPQGRPVPPLGPETRVSWWVE 
RYDDIENFPLSELMTEISTGVETTANSSTSLRSTTLEKEVPVIF 
IHPLNTGLFRIKIQGATGKFNMVIPLVDGMIVSRRALGFLVRQT 
VINICRRKRLESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 


656 


PASTTDGAQE ARVPLDGAFW I PRPPAGSPKGCFAC VS KPPALQA 
PAAPAPEPSASPPMAPTLFPMESKSSKTDSVRAAGAPPACKHLA 
EKKTMTNPTTVI E VYPDTTEVNDYYLWS I FNFVYLNFCCLGFI A 
LAYS L KVRDKKliLNDLNG AVEDAKTDRL IN I TRS GLAAS C I MLW 
MALSVIATHRGLRSSASILVAEPHDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 
RPGPGAGAPAGR PEGGGPWARTEGSS LHSEPERAGLGPAPGTES 
PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEEAS PWTQPGVHG P 
WTELETHGSOTOPERVKSWAnMTiMTHnMCQCT nTUDcparocve 

PSADGSWKELYTDGSRTOODIEGPWTEPYTDGSQKKQDTEAARK 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KT VLKYS PFWS FRKHYPWVQLSGHAGNFQAGEDGR I LKRFCQC 
EQRSLEQLMKDPLRPFVPAYYGMVLQDGQTFTCQMBDLLADFEGP 
SIMDCKMGSRTYLEEELVKARERPRPRKDMYEKMVAVDPGAPTP 
EEHAQGAVTKPRYMQWRETMSSTSTLGFRIEGIKKADGTCNTNF 
KKTQALEQVTKVLEDFVDGDHVILQKYVACLEELREALEISPFF 
KTHEWGS S LLFVHDHTGLAKVWM I DFG KT VALPDHQTLS H RLP 
WAEGNREDGYLWGLDNMICLLQGLAQS 


6325 


165 


944 


GLRDP FRRKRRLKPQVKMSNYVNDMWPGSPQEKDSPS TSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRRYSRS YSRS RS RSRSRRYRERRYG FTRRYYRS PSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRS RT P FRLS EKDRMELL E I AKTNAAKALGTTN I DLP AS LRT 
VPSAKETSRGIGVSSNGAKPEVSILGLSEQNFQKANCQI 


6326 


238 


680 


GEPS P ATQQ K PS ATG AG VLHQH FSSGH I YVLMGLL PPPWTISFT 
VQTTLQPPGGLPAAPVSGRMAFEPVGRDLARRMVPRAGKRTQTL 
GARRVAAQGARPLPEDRRPKSGERLHVTVAPCWEFVLPSVSLTA 
beginning
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
corresponding
residue of
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QAWGGVGQEASSGVP 


6327 


1 


1337 


S LARIiA PAGGS WM PTQQPAAPSTRAPKPSRSLSGSLCALFS DA 
DSGSGM KAELPPGPGAVGRBMTKE BKLQLRKEKKQQKKKRKE EK 
GAEPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE 
LRAERRAKQEAERALKQARKGEQGGPPP KAS PSTAGETPSGVKR 
LPEyPQVDDLLLRRLVKKPERQQVPTRKDYGSKVSLFSHLPQYS 
RQNS LTQFMS I PSSVIHPAMVRXjGLQ YS QGL VRGSNAR CI ALLR 
ALQQVIQDYTTP PNEELSRDLVNKLK P YMS FLTQCRPLSASMHN 
AIKFLNKEITSVGSS KREEEAKSELRAAI DRYVQEKI VLAAQAI 
SRFAYQKISNGDVILVYGCSSLVSRILQEAWTEGRRFRVWVDS 
RPWLEGRHTLRSLVHAGVPAS YLIiI PAAS YVLPE VS TEEKDS KV 
GGEKV 


6328 


1030 


276 


HA3AE VTTAAARGLGAMEEEMHTDAKI RAENGTGSS PRGPGCS L 
RHFACEQNLLSRPDGSASFLQGDTSVLAGVYGPAJBVKVSKEIFN 
KATliEVILRPFCIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSI 
TWLQVVSDAGSLLACCLNAACMALVnAGVPKRALFCGVACALD 
S DGTL VLD PT S KQE KE ARAVLT FALDS VERKLLM S S T KGLYS DT 
ELQQCLAAAQAASQH VFRF YRES LQRRYS KS 


6329 


3 



2016 


S S E VAAGGGTRSAMAEGSGE WTVSATGAANGIiNNGAGGTSATT 
SNPtiSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL 
RGDIERKSLAINEEFVSIFKEVKEELESISEDVQAMSNCCQDMT 
SRLQAAKEQTQDLIVKTTKLQSESQKLEIRAQVADAFLS KFQLT 
SDEMSLLRGTREGP I TEDFFKALGRVKQIHNDVKVLLRTNQQTA 
GLEIMEQMALLQETAYERLYRWAQSECRTLTQESCDVSPVLTQA 
ME ALQDRP VLYKYTLDE FGTARRSTWRGF I DALTRGGPGGT PR 
PIEMHSHDPLRYVGDMLAWLHQATASEKEHLEALLKHVTTQGVE 
ENIQEWGHITEGVCRPLKVRIEQVIVAEPGAVIiLYKrSNLLKF 
YHHTISG I VGNSATALLTTI EEMHLLSKKI FFNSLSLHASKLMD 
KVELPPPDLGPSSALNQTLMLLREVliASHDSSVVPLDARQADFV 
QVLSCVLDPLLQMCTVSASNLGTABMATFMVNSLYMMKTTLALF 
E FTDRRLEMLQ FQ I EAH LDTI* I N E QAS YVLTRVGLS Y I YNTVQQ 
HKPEQGSIiANMPNLDSVTLKAAMVQFDRYLSAPDNLLlPQLNFL 
LS ATVKEQ I VKQSTELVCRAYGE VYAAVMN PINEYKDPENILHR 
SPQQVQTLLS 


6330 


1151 


333 


FFYYTFYBNKTFSRKMVAEKETLSIiNKCPDKMPKRTKLLAQQPL 
PVHQPHSLVSEGFTVKAMMKNSWRGPPAAGAFKERPTKPTAFR 
KFYERGDFPIALEHDSKGNKIAWKVEIEKLDYHHYLPLFFDGLC 
EMTFPYEFFARQGIHDMLEHGGNKItiPVLPQLI I PIKNALNLRN 
RQVICVTLKVLQHLWSAEMVGKALVPYYRQILPVLNIFKNMNV 
NSGDGIDYSQQKRENIGDLIQETLEAFERYGGENAFINIKYWP 
TYESCLLN 


6331 


3 


495 


QQGQRVRTRGRRACASATPLiEGCYDLSYPRTHAALLKVAQMVTL 
LIAFICVRSSLWTNYSAYSYFEVVTICDLIMILAFYLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNQSGLVAGA 
I FGFMATFLCMASIWLS YKI SCVTOSTDAAV 


6332 


1 


878 


.VTESNKFDLVSF1PLLRERIYSNU0VARQFIISWILVLESVPDI 
NLLD YLPE I LDGLFQI LGDNGKE I RKMCEWLGE FLKE I KKNPS 
SVKFAEMANILVIHCQTTDDLIQLTAMCWMREFIQIiAGRVMLPY 
SSGI LTAVLPCLAYDDR KKS I KEVANVCNQSLMKLVTPEDDELD 
BIiRPGORQAEPTPDDALPKQEGTASGEWTPSLHLTSCRGPREPD 
VIG VALGP H LS NQD Y FM YVTHT I VAATQRSGSS GS PP FCRQDTG 
KLSTMATHSQLVKTGTGLEPRQAVSSSH 


6333 


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLLAPLIPPRSAG 
QPLTFSPSGRQPLRSLLVGMCSGSGRRRSSLSPTMRPGTGAERG 
GLMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS 
G AKS MWTEHKS PDGRTYY YNTE T KQSTWB K PDDLKT P AEQ LLS K 
CPWK3YKSDSGKPYYYNSQTKESRWAKPKELEDLEGYQNTIVAG 
SLITKSNuHAMIKAEESSKQEECTTTSTAPVPTTEIPTTMSTMA 
AAEAAAA WAAAAAAAAAAAAANANASTS AS NTVSGT V P WPB P 
EVTS I VATWDNENTVTISTEEQAQLTSTPAIQDQSVEVS SNTG 
EETSKQETVADPTPKKEEEESQPAKKTYTWNTKEEAKQAFKELL 
KBKR VPSNAS WEQAMKMI IND PRYS ALAKLS EK KQAPNAYKVQT 


6334 


17 


644 


GGNPSGRAAGFAAAAMPSSPLRVAWCSSNQNRSMEAHW1LSKR" 
GFSVRSFGTGTHVKLPGPAPDKPlTirYnFjrrTvnnviVKmT t ovni/ 

ELYTQNGII.HMLDRNKRIKPRPERFQNCKDLFDLILTCEERVYD 
QWEDLNSREQETCQPVH WNVDIQDNHEEATLGAFL ICE LCQC 
IQHTEDMENEIDELLQEFEEKSGRTFLHTVCFY 


6335 


82 


529 


AARAR PG VLCCR IiLiGAALGDO S R VP M *? Y T Pfin Pirra\n/n b v c t u 

KLRQGENLILGFSIGGGIDQDPSQWPFSEDKTDKGIYVTRVSEG 
GPAE I AGLQIGDKI MQVNGWDfOTMVTHDQARKRLTKRS EEWRL 
LVTRQSLQKAVQQSMLS 


6336 


1003 


438 


HEPASKGRAEVGNMRLSVAAAISHGRVFRRMGLGPESRIHLLRN 
IiTGLVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTNERAMRM 
ADFVJLTEKDLIPKLFQVLAPRYKDQTGGYTRMLQIPNRSLDRAK 
MAV I E YKGNCL PPLPL PRRDSHLTLLNQLLQG LRQDLRQ SQEAS 
NHSSHTAQTPGI 


6337 


76 


524 


EGIQMLSVQPDTKPKGC^GCNR^IKDRYLLKALDKYWHEDCLKC - 
ACCDCRLGEVGSTLYTKANLILCRRDYLRI.FfJVTrrNrrAarcwT t 

PAFEMVMRAKDNVYHLDCFACQLCNQRFCVGDKFFLKNNMILCQ 
TDYB EGLMKEG YAPQVR 


6338 


66 


1349 


APNSESGTQGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP' 
GLRLALLLLLGLGTPECSGVQGQEGLDFPEYIIGVDRVINVNAKNY 
KNVF KKYE VLALLYHE PP EDDKASQRQ FEMEEL I LELAAQ VLED 
KGVG FGLVDSE KDAAVAKKLG LTE VDSM YVFKGDE VI EYDGEFS 
ADTIVEFLLDVLEDPVELIEGERELQAFENIEDE IFCLIGYFKS K 
DSEH YKAFEDAAEEFHPYI PFFATFDSKGAKKLTLKLNE I DFYE 
AFMEEPVTIPDKPNSEEEIVNFVEEHRRSTLRKLKPESMYETWE 
DDMDGIHI VAFAEEADPDGFEFLETLKAVAQDNTENPDLS I IWI 
DPDDFPLLVTYWEKTFDIDLSAPQIGVVNVTDADRLWMEMDDEE 
DLP5 AE E LE DWLED VLEGE INTBDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHS FFSQGAMKAFH ' 

TFCWLLVFGSVSEAKFDDFEDEEDIVEYDDNDFAEFEDVMEDS 

VTESPQRVIITEDDEDETTVELBGQDENQEGDFEDADTQEGDTE 

SEPYDDEEFEGYEDKPDTSSSKNKDPITIVDVPAHLQNSWESYY 

LEILMVTGUAYI MNY I IGKNKNS RLAQAWFNTHRELLESNFTL 

VGDDGTNKEATSTGKLNQENEHIYNLWCSGRVCCBGMLIQLRFL 

KRQDL LNVLARMMR P VS DQVQ I KVTWNDEDMDTYVFAVGTRKAL 

VRLQKEMQDLSEFCSDKPKSGAKYGLPDSLAILSEMGEVTDGMr4 

DTKMVHFLTHYADKIESVHFSDQFSGPKIMQEEGQPLXLPDTKR 

TLLLTFNVPGSG1ITYPKDMEALLPLMNMVIYSIDKAKKFRLNRE 

GKQKADKNPJVRVEENFLKLTHVQRQEAAQSRREEKICRAEKBRIM 

NEEDPEKQRRLEEAALRREQKKLEKKQMKMKQIKVKAM 


6340 


2 


583 


EAC7\HTLSCPAFARLGRARRRPWMSHRTSSTFRAERSFHSSSSS 
S S S STS SS AS RALPAQD PPME KALS M FSDD ?GS FMR PHS EP LAF 
PARPGGAGNI KTLGDAYEFAVDVRDFS PEDI I VTTSNNHIEVRA 
EKLAADGTVMNNFAH KCQL.PEDVDPTS VTSALREDGS LTIRARR 
HPHTEHVQQTFRTEIKI 


6341 


2 


645 


kMAVLSAPGLRGFRILGLRSSVGPAVQARGVHQSVATDGPSSTQ 
PALPKARAVA P KPSS RGE YWAKLDDL VNWARRSS LW PMTFGLA 
CCAVEMMHMAAPRYDMDRFGWFRASPRQSDVMIVAGTLTNKMA 
PALRKVYDgMPEPRYWSMGSCANGGGYYHYSYSWRGCDRIVP 
VDIYIPGCPPTAEALLYGILQLQRKIKRBRRLQIMYRR 


6342 


2 


1191 


DPRVRAMLATLARVAALRKTCLFSGRGGGRGLWTCRPQSDMNNI 
KPLEGVKILDLTRVLAGPFATMNLGDLGAEVIKVERPGAGDDTR 
TWGPPFVGTESTYYLSVNRNKKSIAVNIKDPKGVKIIKELAAVC 
DVFVENYVPGKLSAMGLGYEDIDEIAPHIIYCSITGYGQTGPIS 
QRAGYDAVASAVSGLMHITGPBVACLSHIAAIJYLIGQKEAKRWG 
TAHGSIVPYQAFKTKDGYIWGAGNNQQFATVCKILDLPELIDN 
SKYKTNHLRVHNRKELIKILSERFEEELTSKWLYLFEGSGVPYG 
P1NNMKNVFAEPQVLHNGLVMEMEHPTVGKISVPGPAVRYSKFK 
M S EARP P PLLGQHTTHI LKE VLRYDDRAIGELLSAGVVDOHETH 


6343 


2 


936 


GTAMVSDEDELNLLVIWDANPIWWGKQALKESQFTLSKCIDAV 
M VLGNSHLFMNRSNKLAVT AS HI Q ES R FLYPG KNGRLGD F FGD P 
GNPPEFNPSGS KDGKYELLTSANEVI VEEIKDLMTKSDI KGQHT 
ETLIAGSIAKALCYIHRMNXEVKDNQEMKSRILVIKAAEDSALQ 
YMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGIiYIiKV 
PQMPSLLQYLLWVFLPDQDQRSOLILPPPVHVDYRAACFCHRNL 

IBIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAKKKKLK 
VSA 


6344 


2508 


147 


TMPTATLGNLRGYGMAS PGLAhPSLTP PQLATPNLQQFFPQATR 
QSLLGPPPVGVPMNPSQFNLSGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEASELPAKRLRSSEEPTEKEPPGQLQVKAQPOARMT 
VPKQTQTPDLLPEALEAQVLPRFQPRVLQVQAQVQSQTQPRIPS 
TDTQVQPKLQKQAQTQTSPEHLVU3QKQVQPQLQQ3AEPQKQVQ 
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQIiQKQVQTOTYPQVHTQAQPSVQPQEHPPAQVSVQPPEQTHEQ 
PHTQPQVSLLAPEQTPVWHVCGLEMPPDAVEAGGGMEKTLPEP 
VGTQVSMEEIQNESACGLDVGECENRAREMPGVWGAGGSLKVTI 
LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKOALQFFCY 
ICKASCSSQQEFQDHMSEPQHQQRLGEIQHMSQACLLSLLPVPR 
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR 
PFCTVCNRYFKTPRKFVEHVKSQGHKDKAKBLKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEBEIBVEEBLCJCQVRSRDIS 
REEWKGSETYSPNTAYGVDFLVPVMGYICRICHKFYHSNSGAQL 
S HCKS LGH FENLQKY KAAKN PS PTTRP VS RRCA INARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 


6345 


2 


3483 


PRVRTKLXIiLVNDKKRYERVGGGPKRU3RDVEMEEMIEQLQEkv 

HELEKQNDTLKNRLISAKQQLQTQGYRQTPYNNVOSRINTGRRK 

ANENAGLQECPRKGIKFQDADVAETPHPMFTICYGNSLLEEARGE 

IRNLEWVIQSQRGQIEELEHIiAEI LKTQLRRKENE I ELSLLQLR 

EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKFIQLQEK 

QRTLKISHBALMANGDELNMQLKEQRLKCCSLEKQLHSMKFSER 

RIEELQDRINDLEKERELLKENYDKLYDSAFSAAHEEQWKLKEQ 

uuivv v^-"V*jci i/UjKb uu I UKTEI IiDRLKTERDQNEKLVQENREL 

QlfQYLEQKQQLDBLKKRIKLYNQENDINADELSEALLLIKAQKE 

QKNGDL3FLVKVDSEINKDLERSMRELQATHAETVQELSKTRNM 

LIMQHKINKDYQMBVEAVTRKMENLQODYELKVEQYVHLLDIRA 

ARIHKIiEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETlHLERG 

ENLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 

VRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITLEVHQAYSTEYE 

TIAACCLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 

RVPMDQAIRLYRERAKALGYITSNFKGPEHMQSLSQQAPKTAOL 

SSTDSTDGNLNELHITIRCCNHLQSRASHLQPHPYWYKFFDFA 

DHDTAI I PSSNDPQFDDHMYFPVPMNMDLDRYLKSBSI.S FYVFD 

DSDTQENIYIGKVNVPLISLAHDRCISGIFELTDHQKHPAGTIH 
VILKWKFAYLPPSGSITTEDLGNFIRSEEPEWQRLPPASSVST 
LVIiAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQEGSVDEVKEN 
TEKMQQGKDDVSLLSEGQLAEQSLASSHHKTEITEDLEPEVEED 
MSASDSDDCIIPGPISKNIKQPSEKIRIEIIALSLNDSQVTMDD 
TIQRLFVECRFVSLPAEETPVSLPKPKSGQWVYyNYSNVIYVDK 
ENNKAKRDILKAILQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
GVAH VDLADMFQEGRDL I EQNIDVPDARADGEG IGKLR VTVEAL 
HALQ3VYKQYRDDLEA 


6346 


2921 


533 


ODRRLLRLELQKTCQPTSTMSGSKTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC 
QGTGMVAAC LLL FLEEE DA FWMMS A 1 1 EDLLPAS YFSTTLLG VQ 
TDQRVLRHLIVQYLPRLDKLU)EHDIEI£LITLHWFLTAFASVV 
uxrajuuKi mJLii? t YEGSRVIjFQLTIjGMLHLKEEELIQSENSASI 
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLREAILRVARHFQCTDPKNCS WSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHEMYVACSRSH 
RR RAKALLDFERHDDDELG FRKND 1 1 TI VS QKDEH CWVGELNGL 
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
I CVGLNEQ VLH LW L E VLCS S LPTV EKW YQPWS FLRS PGWVQI KC 

ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6347 


2921 


533 


QDRRLLRLEIjQKTCQPTSTMSGSHTPACGPFSALTPS I HPQE I L 

AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 

DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVS lprs eklrs lvla 

GIPHGMRPQLWMRLSGALQKKRNSEliSYREIVKNSSNDETIAAK 

QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC 

QGTGMVAACLLLFLEEEDAFWMMSAI I EDLLPAS YFSTTLLGVQ 

TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV 
DIKLLLRI WDLPPYPnQOVT j?r\z tt y^mt ut ynpuy T/ -. _ n 
w*4WMU(\*ni/ur r i ovron vur liunJjHJUlUSaEJLiXuSENSAS I 

FNTLS DI PSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLREAILRVARHFQCTDPKNCS WSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGI* 
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGL KKPS LLGG ACH PWL F I E E AAGRE VERDFAS VYSR LVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
I CVGLNEQ VLHLWLE VLCSS L PT VE KW YQ P WS FLRS PGWVQI KC 

ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6348 


3 


3679 


AGAEKCFVTLUACFLAKQQNKYKYEECKDLIKSMiRNELQFKEE 
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHG PCDS NQ PHKNI KI TF EE DE VNSTL WDRES S HDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
E KKQQFRNLKE KCFLTQLACFLANQQN K YK YE ECKDL I KFM LRN 
ERQ FKE E KLAEQLKQAE ELRQ YKVL VHSQERELTQLR E KLR EGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
ECAITC5NSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI IPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQ P YRSAF YVLEQQR VGLAVNMDE I E KYQE VE EDQDPS CPR LS R 
ELLDEKEPEVLQDSLGRCYSTPSGYLBLPDLGQPYSSAVYSLEB 

QYLGl»ALDVDRIKKDOEEEFTXV5PPr , T>I7T.CD1?T T Mmrneiirnn 1 

S LDR CYSTPS S CLE OP DS CQP YGS S F YA LEE KH VGFS LD VGB I E 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PBVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL 
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEfCH 
VGFS LDVGE I E KKGKGKKRRGR RS K KERRRGR KEGEEDQN P PCP 
RLNSMLMEVEEPEVIjGDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEEEH IS FALYVDNR FFTLTVTS LHLVFQKGVI FPQ 


6349 


3 


3679 


AUAKKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE | 

KLAEQLKQABELRQYKVLVHSQERELTQLREKLREGRDASRSLN 

EI ILQALLTPD E P DKS QGQD LQEQLAEGCRLAQHL VOKLS P ENDN 

DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 

NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 

ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 

EKKQQFRNLKEKCFLTQIACFLANQQNKYKYEECKDLIKFMLRN 

E RQ FKE EKLAEQL KQAE ELRQY KVLVHS QERELTQLREKLR EGR 

DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 

LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 

ECAITCSNSHGPYDSNQPHRKTKITFEEDFCVDSTLIGSSSHVEW 

EDAVHI IPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 

YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 

KEDHBATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQP YRS AFYVLEQQRVGLAVNMDE I EKYQE VEEDQDPS CPRLS R 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 

OYLGLALDVDRIKICDnRUlRRnnnDDPDDT ODPT T rmremnirr *>v*%. 1 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALBEKHVGFSLDVGEIE 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 
DMDEIEKYQEVEBDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL 
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEEEH I S FALYVDNRFFTLTVTSLHLVFQMGVI FPQ 


6350 


3 


3679 


AGAE KCFVTLLACFLAKQQN K Y K YE ECKDL I KSMLRNBLQFKEEl 
KLAEQLKQAEELRQYKVLVHSQERELTQLRBKLREGRDASRSLN 
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNI KI TFEEDEVNSTLWDRESSHDECQDAIiN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
E KKQQFRNLKEKCF LTQLACF LANQQNKYKYEECKDL I K FM LRN 
E RQ FKE E KLAEQLKQAEELRQ Y K VLVHSQE R E LTQLR EKLREGR 
DASRSLNEHLQALLTP DE P DKS QGQDLQEQLAEG CRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YS TLS I P P EMLAS Y KS YSST FHS LEEQQ VCMAVD IGRHRWDQVK 

KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQ P YRS AFYVLEQQR VG LA VNMDE I E KYQE VE EDQD P S CPRLS R 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
v i lAJijmjifvuKiKRuyBEEEDQGP PCPRLSRELLEWEPEVLQD 
SLDRCYSTPSSCLEQPDSOQPYGSSFYALEEKHVGFSLDVGEIE 
KKGKGKKRRGRRS KKERRRGR KEGEEDQN P PCPRLSRELLDE KG 
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQD PS C PRLS RE LLDE KE P EVLQDS LG RC YST PSG YLE LP DLGQ 
PYSSAVYSLEEQYLGLALDVDR I KKDQEEEEDQGPPCPRLSREL 
iiiiVvbPisvijQDSIjDRCYSTPSSCLEOPDSCQPYGSSFYALEEKH 
VGFSLDVGE I EKKGKGKKRRGRRS KKERRRGR KEGEEDQN PPCP 
RLNSMLMEVBEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEEEHIS FALYVDNR F FTLTVT S LHLVFQMG VI F PQ 


6351 


1291 


319 


KKARRRTERSQLGRMLWEVANGRSLVWGAEAVQALRERLGVGG 
RTVGALPRGPRQNSRLGLPLLLMPEEARLLAEIGAVTLVSAPRP 
UbKHHbLALTS FKRQQE ES FQEQS ALAABARETRRQELLEKITE 
GQAAKKQKLEQASGASSSQEAGSSQAAKEDETSDGQASGEQEEA 
G PS S S Q AG P S N G VA PLPRS ALLVQ LATAR PRP VKAR PLDWRVQS 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 
PLRFHAHYIAQCWAPBDTIPLQDLVAAGRLGTSVRKTLLLCSPO 

PDfilfWYTCT rktJRCTJ^ 


6352 
6353 


235 


923 


WSEWLSPCHAAKCKGLSMLRITMKTRAISLAADATEFVQGRSAP 
AMARSLVHDTVFYCLSVYQVKISPTPQLGAASSAEGHVGQGAPG 
LMGNMNPEGGVNHENGMNRDGGMI PEGGGGNQE PRQQPQPPPEE 
tnynm i^vyryr army f k i KK 1 K.r rJjLQVEELES VFRHTQYPDVP 

TRRELAENIX5VTEDKVRVWFKNKRARCRRHQRELMLANELRADP 
DDCVYIWD 


"6354 


65 


672 


KKAGAGAI PEARARPPbVQAAEEEKEMULPDSASRVFCGRlLSM 
VNTDDVNAIILAQKNMLDRFEKTNEMLLNFNNLSSARLQQMSER 
FLHHTRTLVEMKRDLDS I FRRIRTLKGKLARQHPEAFSHIPEAS 
FLE EE DEDP I PPS TTTT I ATS EQS TGS CDTS PDTVS PSLS PG FE 
DLSHVQPGSPAINGRSQTDDEEMTGE 


6355 


965 


510 


FSLRPMEPTKLJL'PL t'GGAF^S Alt LPMGA I D VS DLRP VPDNQE VFC 
HPVTDQSLIVELLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 

SVQPLSLENLALRGRCQEAWVLSGKQQIAKENQQVAKDVTLHQA 
LLRLPQYQTDLLLTFNQPP 




158 


1662 


^ uou "^^ / w^'vKVjjir'nUKvjKurijLlRRPGTRRGGFSLD 
WDGKVS E I KKKI KS I LPGRS C DLLQDTSHL P P EHS D WI VGGG V 
LG LS VA Y WLKKLE S RRGAI R VL WERDHTYS QAS TGLS VGG I GQ 
QFSLPENIQLSLFSASFLRNINEYLAWDAPPLDLRFNPSGYLL 
LASEKDAAAMESNVKVQRQEGAKVSLMSPDQLRNKFPWINTEGV 
AliASYGMEDEGWFDPWCLLQGLRRKVQSLGVLFCQGEVTRFVSS 
SQRMLTTDDKA WLKR I HE VHVKMDRS LEYQPVECA I VI NAAG A 
WS AQ IAALAG VGEG P PGTLQG TK LP VE P RKR YVYVWHC PQG PG L 
ETPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD 
FFQDKVWPHLALRVPAFETLK VQS AWAG YYDYNT FDQNG WG PH 
PLWNM Y FATGFSGHGLQQAPGIGRAVAEI4VLKGR FQTIDLS P F 
LFTRFYLGE KI QENNI I 


6356 


354 


633 


TGLTSSCLPLQVMMTKRTKDMGKFSSVTVSTIDEEEBEIEAREV"" 
ADSYAQNAKVI EKQLERKGMS KRRLQELAEL3AKKAKMKGTLID 
NQFK 




WO 01/53312 



PCT/US00/34263 



6357 


2 


915 


GLLRNMALLVRVLRNQTSISQWVPVCSRLIPVSPTQGOGDRALS 
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 
QPVEEKVGAFTKIIEAMGFTGPLKYSKWKIKIAALRMYTSCVEK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWNCLVRMKQEGRSGKYM 
CRI IVHFMWEDVQORGRVMGVNPyiLKKNMILMTNHFYAAlLGY 
DEGILSDDHGLAAALWRTFFNRKCEDPRHLELLVEYVRKQIQYL 
DSMNGEDLLLTGEVSWRPLVEKNPQS1LKPHSPTYNDEGL 


6358 


2009 


1040 


ASDALHSLSAPVLRI^SRSAARPATMTEQAISFAKDFLAGGIAA 
Al SKTAVAP I ER VKLLLQVQHAS KQI AADKQYKG 1 VDCI VRI PK 
EOGVLSFWRGNLANVIRYFPTQAliNFAFKDKYKQl FLGGVDKHT 
QFWRYFAGNLASGGAAGATSLCFVYPLDFARTRLAAJDVGKSGTE 
RE FRG LGDCLVK I TK S DG I RGLYQG FS VS VQG 1 1 IYRAAYFGVY 
DTAKGML PD P KNTH I WS WMIAQTVTAVAG VVS Y PFDT VRRRMM 
MQS GR KGAD I M Y TGTVDC WRK I FRD EGGKAFFKG AWSNVLRGMG 
GAFVLVLYDEL KKVI 


6359 


98 


1086 


VCRQEEEKMKEDCLPSSHVPISDSKSIQKSELLGLLKTYNCYHE 
GKS FQLRH R E E EGTL 1 1 EG L LN I A WG LRRP I RLQMQDDREQ VHL 
PSTS WM PRRPS CPbKE PS PQNGN-I TAQGPS IQPVHKAES STDSS 
GPLEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQR I RRHRFS 
1NGHFYNHKTSVFTPAYGSVTNVRVNSTMTTLQVLTLLLNKFRV 
EDGPSEFALYIVHESGERTKLKDCEYPLISRILHGPCEKIARIF 
LMEADIX3VEVPHEVAQYI KFEMPVLDSFVEKLKEEEEREI I KLT 
MKFQALRLTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEEWLPPRSCRVFW IHSGTTMSKVS FKITLTSDP , 
RLPY KVLS VPE STPFTAVLKFAAEEFKVPAATS AI ITNDG IGI N 
PAQTAGNVFLKHGS ELR 1 1 PRDRVGSC 


6361 


615 


158 


RPGLGQLQHCALAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ 
FKDTLNTPLPDSS P VAVPLGAPI AVASTLSVEHNDGVETGI WAC 
APGRWRRQITSQEFCHF1QGRCTFTPDDGETLHIQAGDALMLPA 
NSTG I WDIQETVRKTYVLIL 


6362 


350 


1576 


TTMDGSHSAAliKLQQLPPTSSSSAVSEASFSYKENLIGALLAIF 
GHLWSIALKLQKYCHIRIiAGSKDPRAYFKTKTWWLGLFLMLLG 
ELG VF AS Y AFAPLS L I VPLS AVSVI AS AI IG 1 1 F I XEKWK P KD F 
LRRYVLSFVGCGLAWGTYLLVTFAPNSHEKMTGENVTRHLVSW 
PFLLYMLVEIILFCLLLYFYKEKNANNIWILLLVALIiGSMTW 
TVKAVAGMLVLS IOGNIiOLDYPIFYVMPVCMVATAVYQAAFLSQ 
ASQMYDSSLIASVGYILSTTIAITAGAIFYLDFIGEDVLHICMF 
ALGCIiIAFLGVFLITRNRKKPIPFEPYrSMDAMPGMQNMHDKGM 
TVOPELKASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVLEHTKKE 


6363 


21 


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPWIDNGSGVIKAGFAG 
DQIPKYCFPMYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMBHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
bfeLfe VTHA V fc* I Y EG FAM PHS I MRI D I AGRDVS RFLRLY LR {CEG Y 
DFHSSSEFEIVKAIKERACYLS1NPQKDETLETEKAQYYLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGS ILAS LDTFKKM WVS KKEYEEDGARS IHRKTF 


6364 


21 


1201 


RRTRLGSS FPRRRDS 3AMES YDVI ANQP WI DNGSGVI KAGFAG 
DQ 1 PK YCF PNYVGRPKHVRVMAGALEGD I FIGPKAEEHRGLLS I 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEBHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAVPI YEG FAMPHS IMRIDI AGRDVS RFLRLYLRKEG Y 
DFHSSSEFE IVKAIKERACYLS INPQKDETLETEKAQYYLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLPKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGS I LAS LDTFKKMWVS KXE YEEDG ARS I H RKTF 


6365 


234 


1959 


KHKSRASCAARAQAFGPSREREVHSRFRSGLRRiiGESilSGCCTW 
ASMGTLAFDEYGRPFLI I JCDODRJCSRLMGT,RJ*T.T( , QUTiv& airawa 
NTMRTS l^PNGLDKIWWDKDGDVTVTNDGATI LSMMDVDHQIAK 
LMVELSKSQDDE I G DGTTG VWLAG AL LE EAE Q LLDRG I HP I RI 
ADGYEQAARVAI EHLDKI SDSVLVDI KDTEPL IQTAKTTLGSKV 
VNSCHRQMAEIAVNAVLTVADMERRDVDFELIKVEGKVGGR LED 
TKLIKGVIVDKDFSHPQMPKKVEDAK1AILTCPFEPPKPKTKHK 
LDVTSVEDY KALQKYFKE KFEEMI QQ I KETGANLAICQWGFDDE 
ANHLLLQNNLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
GFAGLVQEISFGTTKDKMLVIEQCKNSRAVTIFIRGGNKMIIEE 
AKRSLHDALCVIRNLIRDNRVVYGGGAAEISCALAVSQEADKCP 
TLEQYAMRAFADALEVI PMALSENSGMNPIQTMTE VRARQVKEM 
NPAIX3IDCLHKGTNDMKQQHVIETLIGKKO^ISIoAlWVRMILK 
IDD1RKPGESEE 


6366 


257 




vjJNJ\c>V7AHbb lb W VLiDa 1 r LfaAVAMLCKEOGITVLGLNAVFDILV 
IGKFWVLEIVQKVLHKDKSLENLGMLRNGGLLFRMTLLTSGGAG 
MLYVRWRIMGTGPPAFTEVDNPAS FADSMLVRAVNYNYYYSLNA 
WLLLCPWWLC FDWSMGCI PLI KS I SDWRVI ALAALWFCLIGLI C 
QALCSEDGHKRRILTLGLGFLVIPFLPASNLFFRVGFWAERVL 
YLPSVGYCVLLTFGFGALSKHTKKKKLIAAWLGILFINTLRCV 
LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRLNPKYVHAMNNLGNIL KERNE LQEAEELLS LAVQI Q 
PDFAAAWMNLGIVQNSLKRFEAAEQSYRTAIKHRRKYPDCYYNL 
GRL YADLNRH VD ALN AWRNATVLKPEHS LAWNXJM 1 1 LLDNTGNL 
AQAEAVGREALELIPNDHSU4FSLANVLGKSQKYKESEALFLKA 
IKANPNAASYHGNLAVLYHRWGHLDLAKKHYEISLQLDPTASGT 
KENYGLLRRKLELMQKKAV 


6367 


287 


1934 


oiuf r viMijvLhb j.Li1j I A^fcnryUoVAr fcLJVAvb rTQEEWALLDPS 
QKNLYRDVKQETFKNLTSVGKTWKVQNIEDEYKNPRRNLSLMRE 
KLCESKESHHCGESFNQIADDMLNRKTLPGITPCESSVCGEVGT 
GHSSLNTHIKADTGHKSSEYQEYGENPYRNKECKKAFSYLDSFQ 
SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFLNLCLIHERIHTGVKPYKCKQCGKAFTRSTTLPVHER 
THTOVNADECKECGNAFSFPSEIRRHKRSHTGEKPYECKQCGKV 
FISFSSIQYIIKMTHTGEKPYECKQC!GKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEICTKTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH 
ECKQCG KAFRY FS SLH I HERTHTGDKP YECKVCG KAFTCS S S I R 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFI RASS CREH ERTHT INR 


6368 


1 


327 


RPVPAKLNPRSWPRTAGALPLRPPPLTMAVFHDEVBIEDFQYDE 
DSETYFYPCPCGDNFSITKBDLENGEDVATCPSCSLIIKVIYDK 
DQFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVl'RTIHGSPREDTGT " 

PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQKNLYRDVMQET 

F KNLTS VG KT WKVQN I EDE YKN P RRNLS LMRE KLCE S KESHHCG 

ESFNQIADDMLNRKTLPGITP CESS VCGEVGTGHSSLNTH1 RAD 

TGHKSSEYQEYGBNPYRNKECKKAFSYLDSFQSHDKAC7KEKPY 

DGKECTETFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL 

IKER I HTGVKP YKCKQCGKAFTRSTTLPVHERTHTGVNADECKE 

CGNAFSFPSEIRRHKRSHTOEKPYECKQCGKVFISFSSIQYHKM 

THTGEKPYECKQCGKAFRCGSHLQKHGRTHTGEKPYECRQCGKA 

FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 

EKPHECKECX3KVFKYFSSLRIHER7HTGEKPHECKQCGKAFRYF 

SSLHIHgRTHTGDKPYECKVCGKAFTCSSSIRYHERTHTGEKPY " 

ECKHOGKAFISNYIRYHSRTHTGEKPYQCKQCGKAFIRASSCRE 
HERTHTINR 


6370 


1711 


329 


F VLS EQRLRTERTW PRS PGLGRGAAAAGARTAGAGLLRLLLGCG 
ALVGGLRPVTMTTPANAQNASKTWELSLYEIiHRTPQBAIMDGTE 
IAVSPRSLHSELKCPICLDKLKNTI1TTKECLHRFCSDCIVTALR 
SGNKECPTCRKiCLVSKRSIiRPDPNFDALISKIYPSREEYEAHQD 
RVLIRLSRLHNQQALSSSIEEGLRMQAMHRAQRVRRPIPGSDQT 
TTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
SS VGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGP PS P PGAPS 
PPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKY 
LALR I ALERRQQQEAGEPGGPGGGAS DTGGPDGCGGEGGGAGGG 
DGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTIiNGSLTLELVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDHLGECKSFKEKFMKC - 

LHNNNFENALCRKESKEYLECRNIERKLMLQEPLEKLGFGDLTSG 
KSEAXK 


6372 


2141 


62S 


RVSArASEGKAEERYKKLEDIiLEKSFSLVKMPSLQPWMCVMKH 
LPKVPEKKLKLVMADKEbYRACAVEVRRQIWQDNQALFGDEVSP 
LLKQYILEKESAIiFSTELSVUWFFSPSPKTRRQGEWQRLTRM 
VGKN^/KLYDMVLQFLRTLFLRTRNVHYCTLRAELLMSLHDLDVG 
E I CTVDPCHK FT WCLDAC I RER F VDS KRARB LQGFLDG VKKGQE 
QVLGDLSMILCDPFAINTLALSTVRHLQELVGQETLPRDSPDLL 
LLLRLLALGQGAWDKIDSQVFKE PKMEVEL ITRFLPMLMS FLVD 
DYTFNVDQKLPAEEKAPVSYPNTLPESFTKFLQEQRMACEVGLY 
YVLHITKQRNKNALLRLLPGLVETFGDLAFGDIFLHLLTGNLAL 
LADE FALEDFCSSLFDGFFLTAS PRKENVHRHALRLLIHLHPR V 
APSKLEALQKALEPTGQSGEAVKELYSQLGEKLEQLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373 


67 


711 


PSRAARAS P ARL2AMV S W 1 1 S RLWL I FGTLYPAYYS YKAVKS K 
DIKEYVKWMMYWIIFALFTTAETFTDIFLCWFPFYYELKXAFVA 
WL L S P YTKGSS LLYRKFVH PTLS S KE KE I DDCL VQAKDR S YDAL 
VH FG KRGLNVAATAA VMAAS KGQGALS ERLRS FS MQDLTT X RGD 
GAPAPSGPPPPGSGRASGKHGQPKMSRSASESASS5GTA 


6374 


535 


2105 


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHT FCNYTS STI FLS STRDHSCPTHTSCNYTS ST I FLS S TRD 
HSCPTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRPIIRLSSC 
CPAELQTEGSNGKKBVLSGFQWLEDTVLPPEGGGQPDDRGTIN 
DISVLRVTRKGEQADHFTQTPLDPGSQVLVRVDWERRFDHMQQH 
SGQHL I TAVADHLFKLKTTS WELGRFRSAI ELDTPSMTAEQVAA 
IEQSVNEKIRDRLPVNVRELSLDDPEVEQVSGRGLPDDHAGPIR 
VVN I EG VDS NM CCGTHVS NLS DLQVI KILGTE KG K KNRTNL I FL 
SGN R VL KWMERS HG TE KALTALLKCG AE DHVEA VK K T iQNST K I L 
QKNNLNLLRCLAVHIAHSLRNSPDWGGWILHRKEGDSEFMNII 
ANEI GS E 3TLLFLT VGDE KGGGL FLLAG P PAS VETLGPR VAEVL 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKS 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGLAAWSRT 
CPGRPRRPGQQWRG PTMLVTAYLAFVGHjASCLGLELS RCRAK 
PPGRACSN P S FLR FQ IJJFYQVYFIiALAADWLQAP YL Y KL YQH Y Y 
FLEGQIAILYVCGLASTVLFGLVASSLVDMLGRKNSCVliFSLTY 
SLCCLTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVBR 
HDFPAEWI PATFARAAFWNHVLAWAGVAAEAVASWIGLGPVAP 
FVAAI PLLALAGALALRNWGENYDRQRAFSRTCAGGLRCLLSDR 
RVLLLGTI QALFESVI PI FVFLWTPVLDPHGAPLGI I FSSFMAA 
SLLGSS LYR IATSKRYHIX3PMHLLSLAVLI WFSLFMLTFSTS P 
GQES P VES F I AFLL I E LACG Ii YFPS MS FLRR KVI PBTEQ AGVLN 
WFRVPI^SlACLGLLVUlDSDRKTGTRNMFSICSAVMVl^lAV 
VGLFTWRHDAELRVPSPTBEPYAPBL 


6376 


380 


1437 


QDLLIAALGMKLGSPKSSVTIWQPLKLFAYSQLTSLVRRATLKE 
NEQI PKYEKIHNFKVHTFRGPHWCEYCAKFMWGLIAQGVKCADC 
GLNVHKQCSKMVPNDCKPDLKHVKKVYSCDLTTLVKAHTTKRPM 
VVDMCI RE I ESRGLNSEGLYRV<?f:F<: m . t rnvvM& tmoTVDv * r\ 

ISVNMYEDINIITGALKLYFRDbPIPLITYDAYPKFIESAKIMD 

PDEQLETLHEALKLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 
ENLGIVFGPTLMRSPELDAMAAT MHTOvriDT tmcT t Twrj-.T>-rT t-i 


6377 
6378 


2311 


1845 


i> R I RRRS SRk PRE P PGPSRRRRRRRPDPRTMPSE KTFKQRRTFE " 
QRVSDVRLIREQHPTKIPVIIERYKGEKQbPVLDKTKFLVPDHV 
NMSEL I KI I RRRLQLNANQAFFLLVNGHSMVSVSTP ISEVYESE 
KDEDGFI»YMVYASQETFGMKLSV 




606 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNMAVA 
DLALI PDVDI DSDG VFKYVLI RVHSAPRSGAPAAESKE I VRG YK 
WAEYHADIYDXVSGDMQKQGCDCECLGGGRISHQSQDKKIHVYG 

YSMAYGPAO W A T QTT? V T V& WT5 nV-CTlTTHVl'n unnu 


6379 


35 


378 


BRAG S PS PS KAALR R CAPQRS QA PRWPDRAACRRS FQGS QGRAY 
LFNS WNVGCG PAEER VLLTGLHAVAD I YCENCKTTLG W K YEHA 
FESSQKYPCEGKYIIELAHMIKDNGWD 


6380 


1414 


462 


P AVQGQRGAG P PTGRG SGNMAR FALT VVRHG ETliFN KE KI I QGQ 
G VD E P LS ETGFKQAAAAG I FLNNVKFTHA FS SDLMRTKQTMHGI 
L ER S KFC KDMT VKYDSRLRER K YG WEG KAL S ELRAMA KAARE E 
CPVFTPPGGETLDQVKMRGIDPFEFLCQLILKEADQKEQFSQGS 
PSNCLETSLAEIFPLGKNHSSKVNSDSGIPGLAASVLWSHGAY 
MRSLFDYFLTDLKCSLPATLSRSELMSVTPNTGMSLFIINFEEG 
REVKPTVQCICMNI^DHIiNGLTENSLGriNLPSKSNHFEPLKGVP 
LALFTSLLC 


6381 


1668 


218 


AWRAQGSRGFSGA<3WRPRQAAAMNFSEVFia6SLLCKFSPDGK 
YLASCVQYRLWRDVNTLQILQLYTCLDQIQHIEWSADSLFILC 
AMYKRGLVQVWSLEQ PEWHCKIDEGSAGLVAS CWS PDGRH I LNT 
TBFHLRrTVWSLCTKSVSYIKYPKACLQGITFTRDGRYMALAER 
RDCKDYVSIFVCSDWQLLRHFDTDTQDLTGIEWAPNGCVIaAVWD 
TCLBYKXLLYSLDGRLLSTYSAYEWSLGIKSVAWSPSSQFrAVG 
SYIX3KVRILNHVTWKMITEFGHPAAINDPKIVVYKEAEKSPQLG 
LGCLSFPPPRAGAGPLPSSESKYEIASVPVSLQTLKPVTDRANP 
KIGIGMLAFSPDSYFLATRNDNIPMAVWVwnTnvT.or ?nm pht- 

SPVRAFQMDPQQPRLAICTGGSRLYLWSPAGCMSVQVPGEGDFA 
VI^LCWHIjSGDS MALLS KDHFCLCFLE TEA WGTACRQLGGHT 


6382 


2 


1062 


tEEDEDRNIiCLIAYPIjKGDHGIVDIVDNSDCEPKSKLLRWTTNK 

KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 

TVEKKGNISSQLKHYNPWSMKCHQQQLQRMICENAXHR1JQYKFIL 

LENLTSRYEVPCVLDLKMGTRQHGDDASEEKAANQIRKCQQSTS 

AVIGVRVCGMQVYQAGSGQLMFMNKYHGRKLSVQGFKEALFQFF 

HNGRYLRRELLGPVLKKLTELKAVLERQESYRFYSSSLLVIYDG 

KERPEWLDSDAEDLEDLSEESADESAGAYAYKPIGASSVDVRM 

IDFAHTTCRLYGEDTWHEGQDAGYIFGLQSLIDIVTEISEESG 
E 


6383 


3159 


1061 


spapgrpsphgsqpaaraaaapampsakqrgskgghgaaspsek 

GAHPSAARPLAAPTPAAPACRSPSPGGAPASFPGRAPRSLASQP 
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAKK 
PPPAPQQPPPPPAPHPQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
SSSSSASAAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
3 WC VHH VLE E VQQ VRRSHQDFS RQREELG QGLQG VEQKVQSLQA 
rFGTFESILRSSQHKQDLTEKAVKQGESEVSRISEVLQKLQNBI 
LKDLSDG I HWKDARER D FTS LENTVEERLT ELTKS I N DN I A I F 
TEVQKRSQKEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 
AKSRBWDMEALRSTLQTMESDI YTEVREL VSLKQEQQAFKEAAD 
TERLALQALTEKLLRSEESVSRLPEEIRRLEEBLRQLKSDSHGP 
KEDGGPRKSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 
ESLESLLSKSQEHEQRLAALQGRIjEGLGSSEADQDGIjASTVRSL 
G BTQLVLYGDVE BLKRS VGELPSTVESLQKVQEQVHTLLSQDQA 
QAARXjPPQDFLDRLSSLDNLKASVSQVEADLKMLRTAVDSLVAY 
S VK I ETN ENNLES A KGLLDDLRN DLDRLF VKVEK I HE KV 


6384 


738 


1904 


IWEVPVCLTHIiLHLQQANQPLPPPSSSINEEDADEANRAIGEKR
AAPDSGKKPKTPKTKQQKDPNEPQXPVSAYALPFRDTQAAIKGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVSKAAAESAEAQTIRSVQQTLASTNLTSSLLLNTPLSQ 
HGTVSAS PQTLQQSLPRSI APKPLTMRLPMNQI VTS VT I AANM P 
SNI GAP L I SSMGTTMVGSAPSTQVSPS VQTQQHQMQLQQQQQQQ 
QQQMQQ MQQQQLQQHQMHQQ IQ QQMQCXJHFQHHMQQH LQQQQQH 
LQQQ I NCX3QLQQQLQQRLQLQQLQHMQHQSQPS PRQHS PVASQI 
TSPIPAIGSPQPASQQHQSQIQSQTQTQVLSQVSIP 


6385 


2 


1584 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGESLSGTRES 
LAQGPDAATTDELS SLGSDSEANG FAERR IDXFG FI VGS QGAEG 
ALEE VPLEVLRQRESKWLDP^LNNWDKWMAKKHKKIRLRCQK(4I P 
PSLRGRAWQYLSGGKVKLQQNPGKFDELDMSPGDPKWLDVIERD 
LHRQFPFHEMFVSRGGHGQQDLFRVLKAYTLYRPEEGYCQAQAP 
IAAVLLMHMPAEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWPMCAFSRTLPWSSVL 
RVWDMFFCEGVKI I FRVGLVLLKHALGS PEKVKACQGQYETIER 
LRSLSPKIMQEAFLVQEWELPVTERQIEREHLIQLRRWQETRG 
ELQCRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLDAPLPGS 
KAKP KP PKQAQ K EQRKQMKGRGQLE KP PAPNQAM WAAAGDACP 
PQHVP P KDSAPKDS APQDLAPQVSAHHRSQESLTSQES EDTYL 


6386 


819 


195 


TVCGSFYLGIMQRASRLKRELHMLATEPPPGITCWQDKDQMDDL
RAQILGGANTPYEKGVFKLEVIIPERYPFEPPQIRFLTPIYHPN 
IDSAGRICLDVLKLPPKGAWRPSLNIATVLTSIQLLMSEPNPDD 
PLMAD I S SE FKYN KPAFL KNARQWTE KHARQ KQ KADEEEMLDNL 
PEAGDSRVHNSTQKRKASQLVGIEKKFHPDV 


6387 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRRELAELVKR 
KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6388 


1 


662 


PGPTHAS ADAWADAWAQ PNMAMHNKAAP PQ I PDTRRELAELVKR 
KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6389 


1074 


497 


AEPGDRMAGHRLVLVLGDLHIPHRCNSLPAKFKKLLVPGKIQHI 
LCTGNLCTKESYDYLKTI^DVHIVRGDFDENLNYPEQKVVTVG 
QFKIGLIHGHQVIPWGDMASIxALLQRQFDVDILISGHTHKFEAF 
EHENKF Y I N PGS ATG AYNALE TN HPS F VLMDIQ AS TWTY VYQ 
LIGDDVKVERIEYKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWQGSGSHG 
LTIAQRDDGVFVQEVTQNSPAARTGWKEGDQIVGATIYFDNLQ 
SG E VTQ LLNTMGHHTVG LK LHR KGDR FF PS LGQTWD P 


6391 


5386 


2897 


VRWNS KT E CYLS I OTQEN FPANLNEL VNCI VI SSLVTTQRKLKA 
MSLLGSRNQLARAVLNPNPMDFCTKDLLTTTSERIIAYLRDFNE 
DQKKAI ETAYAM VKHSPSVAK I CL IHGPPGTGKS KT I VGLLYRL 
SEQ ID NO:



6392 



6394



6395



6397 



Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence



972



2017



1418



13



391



Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence



186 



730 



"SIT 



6S8 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion,
\=possible nucleotide insertion)
LTENQRKGHSUfiNSNAKIKQ NRVLVCAPSNAAVDEU^KKIILEF 




KEKCKDKKNPLGNCGDINLVRLGPBKSINSSVLKFSLDSQVNHR 
MKKELPSHVQAMHKRKEFLDYQbDBLSRQRAliCRGGREIQRQEL 
DEN I S KVS XERQELAS KI KBVQGR PQKTQS 1 1 1 LES H I ICCTLS 
TSGG LLLE S AFRGQGG VP FS CV I VDEAGQS CEI ETLTPLIHR CN 
KLILVGDPKQLPPTVISMKAQEYGYDQSMMARFCRLLEENVEHN 
MISRLPILQLTVQYRMHPDICLFPSNYVYNRNLKTNRQTEAIRC 
SSDWPFQPYLVFDVGDGSBRRDNDSYINVQEIKLVMEIIKLIKD 
KR KD VS FRN IG 1 1 TH Y KAQKTM I Q KDIiDKE FDRKG PAE VDT VD A 
FQGRQFCDCVI VTCVRANSIQGS IGFLASLQRLNVTITRAKYSLF 
I LGHLRTLMENQHWNQL IQDAQ KRGA 1 1 KTCDKNYRHDAVKI LK 
LKPVLiQRSLTHPPTlAPEGSRPQGGLPSSKLDSGFAKTSVAASli 
YHTPSDSKEITLTVTSKDPERPPVHDQLQDPRLLKRMGIEVKGG 
IFLWDPQPSSPQHPGATPPTGEPGFPWHQDLSHVQQPAAWAA 
LSSHKPPVRGEPPAASPEASTCQSKCDDPEEELCHRREARAFSE 
GBQEKCQSETHHTRRNSRWDKRTLEQED SSSKKRKLL 
. G RTG VDLAS S MAHRliQ I r LLTWDVK DT LLK LRU PLG EA YATKAR " 
AHGLEVEPSALEQGFRQAYRAQSHSFPNYGIiSHGLTSRQWWLDV 
VLQTFHLAG VQDAQAVAP I AEQL Y KDFS HPCTWQVLDGAE DTLR 
ECRTRGLRLAVISNFDRRLEG ILGGLGLRBHFDF VLTS EAAGWP 
KPDPRIFQEALRLAHNEPWAAHVGDNYIiCDYQGPRAVGMHSFL 
WGPQALDPWRDSVPKEHILPSIAHLLPALDCL EGSTPGL 
TGGSKMAAVATCGSVAASTGSAVATASKSNVTSFQRRGPRASVT 
NDS G P RLVS I AGTR PS VRNGQLLVS TG L PALDQ LLGGGLAVGTV 
LL I EEDK YN I YS PkLFKY FLAEG I VNGHTLLVASAKEDPANILQ 
I ELPAPLLDDKCKKEFDEDVYNHFCTPESNIKMKIAWRYQLLPKME 
i IGPVSSSRFGHYYDASKRMPQELIEASNWHGFFLPEKISSTLKV 
EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 
LGSPLWGDDICCAENGGNSHSLTKFLYVLRGLLRTSLSACIITM 
PTHLIQNKAIIARVTTLSDWVGLESFIGSERETNPLYKDYHGli 
IHIRQlPRLNNLICDESDVKDLAFKIiKRFCLFTIERLHLPPDLSD 
TVSRSSKMDIiAESAKRLGPG CGMMAGGKKHLDF 

gaaaggeglarrrpaamatvma a'Jaaeravleee frwllhdevha 

VLKQLQDILKEASLRFTLPGSGTEGPAKQENFILGSCGTDQVKG 
I VLTLQGDALSQADVNLKMPRNNQLLHFAFREDKQWKLQQIQDAR 

NHVSQAIYLLTSRDQSYQFKTGAEVLKLMDAVMLQLTRARNRLT 
j TPATLTLPEIAASGLTRMFAPALPSDLLVNVYINLNKtiCLTVYQ 

LHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRLEVSHVHKVEC 

VIPWLNDALVYFTVSLQLCQQLKDKISVFSSYWS YRPF 

psgrptrplccaarrgaarkggsvsgwpAgrtptetsnpgssvm" 

ES VT F ED VAVE FIQE WALLD S ARRSLC KYRMLDQCRTLAS RGT P 
PCKPSCVSQLGQRAEPKATERGILRATGVAWESQLKPEELPSMQ 
DLLEEASSRDMQMGPGLFLRMQLVPS IEERETPLTREDRPALQE 
| PPWSLGCTGLKAAMQIQRWI PVPTLGHRNPWVARDSGE 



ANILSSPSKRGQKGTLIGYSPEGTPLY NFMGDAFQHSSQSIPRF " 
IKESLKQILEESDSRQIFYFLCLNLLFTFVELFYGVLTNSLGLI 
SDGFHMLFDCSALVMGLFAALMSRWKATRI FS YGYGRI EILSGF 
INGLFLIVIAFFVFMESVARLIDPPELDTHMLTPVSVGGLIVNL 
IG1CAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GSAGGGMNANMRGVFLHVLADTLGSIGVIVSTVLIEQFGWFIAD 
PLCSLFIAILIFLSWPLIKDACQVLLLRLPPBYEKELHIALEK 
I Q KI EG L I S YRDPH FWRHS AS I VAGT I H I Q VTSD VLEQRI VQQV 

TG1LKDAGVNNLTIQVEKEAYFQHMSGLSTGFHDVLAMTKQMES 
MKYCKDGTYIM 



122 



GAGGVGR FEAI RAP ARM I E WCNDRLG KKVRVKCNTDDTI GDLK~ 
KLIAAQTGTRWNKIVLKKWYTIFKDHVSLGDYEIHDGMNLELYY 

Q 


6398 
6399 


353 


1306 


HKQMGPLINKCKKILLPTTVPPATMRIWLLGGLLPFtiLLLSGLQ 
RPTEGSEVAIKIDFDFAPGSFDDQYQGCSKQVMEKLTQGDYFTK 
DIEAQKWFRMWQKAHIJVWLNQGKVLPQNMTTTHAVA1LFYTLN 
SNVHS DFTRAMAS VARTPQQ YERS FH FKYLH YYLTS A I Q LLRKD 

SIMENGTLCYBVHYRTKDVHFNAYTGATIRFGQFLSTSLLKEEA 

OEFGNOTIjPTT FTPT/liDVnvt?cr vvrtir -r™ •»■»-.,,-» — ~ 

worvnyiur ixc i^i^^^vyx^ijKKEVLIPPYEIjFKVINMSYH 

PRGDWbQLRSTGNLSTYNCQLLKASSKKClPDPIAIASLSFLTS 
VIIFSKSRV 


6400 


75 


1245 


PNLETYFGRRCEKDSMNFTPTHTPVCRKRTWS"KRGVAVSGPTK 
RRGMADSLESTPLPSPEDRLAKLHPSKELLEYYQKKMAECEAEN 
EDLLKKLEL YK B ACEGQHKLECDLQQRS EE IAE LQKAIjS DMQ VC 
LFQEREHVLRLYSENDRLRIRELEDKKKIQNLIALVGTDAGEVT 
YFCKEPPHKVT1LQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDIQTLILQVEALQAOLGEQTKLSREQIEGLIED 
RRIHLEEIQVQHQRNQNKIKELTKNLHHTQELLYESTKDFLQLR 
SENQNKEKSWMLEKDNLMSKIKQYRVQCKKKEDKIGKVLPVMHE 
SHHAQS B YI KVMS LCRNE WY FS GRVEG I P KNLQ FVM 


6401 


2520 


1053 


KTMKCDEWYEVQSAILRHNCGYAMKTGKFFHNLMERKDFETWL 
DNISVTFLSLTDLQKNBTLDIILISLSGAVQLRHLSNNLETLLKR 
DFLKLLPLELSFYLLKWLDPQTLLTCCLVSKQWNKVISACTEVW 
QTACKNLGWQIDDSVQDALHWKKVYLKAILRMKQLEDHEAFKTS 
SLIGHSARVYALYYKDGLLCTGSDDLSAKLWDVSTGQCVYGIQT 
HTCAAVK FDEQ KL VTGS FDNT VACWE WS SGARTQH FRGHTGAVF 
S VDYNDELD I LVS GS ADFTVKVWALS AGTCLNTLTGHTE WVT KV 
VLQKCKVKSLLHSPGDYILLSADKYEIKIWPIGRBINCKCLKTb 
S VSEDRS I CLQPRLH FDGKYI VCSS ALGL YQWD FAS YD I LR V I K 
TPE I ANLALLG FGD I FALLFDNRYLY I MDLRTESLI S RWPIjPE Y 

RKSKRGSSFLAQEASWLNGLDGHNDTGLVFATSMPDHSrHLVLW 
KEHG 


6402


109 


766 


t vj/iawijKFUijKbLC i IjPOPALRMLvLiPSPCPQPLAfSSVETMEG 
PPRRTCRSPEPGPSSSIGSPQASSPPRPNHYLLIDTQGVPYTVL 
VDEESQREPGASGAPGQKKCYSCPVCSRVFEYMSYLQRHSITHS 
EVKPFECD I CGKAFKRASHLARHHS IHbAGGGRPHGCPLCPRRF 
RDAGELAQHSRVHSGERPFQCPHCPRRFMEQNTLOKHTRWKHP 


6403 


1196 


279 


TTSQCGGIRQSSAIPVASMEFAArCLRNALLLLPEEQQDPKQEN 
GAKNSNQLGGNTES S ES S ETCS S KS HDGD KP I P APPS S P LRKQE 

LENLKCSILACSAYVALAIX3DNLMALNHADKLLQQPKLSGSLKF 
LGHLYAAEALISLDRISDAITHLNPENVTDVSLGISSNEQDQGS 
DKGENEAMES SG KRAPQCYPS S VNS ART VML FNLGSA YCLRS E Y 
DKARKCLHQAASMIHPKEVPPEAILLAVYLELQNGNTQLALQII 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6404 


2
1012


1690


222


RGIHTSVU?GNLQNQ^lYSHNVVI^INIINNLNLTQVQQR^^LItNLQ 

RS VDDTSQ AI QR IKND FQN LQQVFLQ AKKDTD W LKE KVQS LQTL 

AANNSALAKANNDTLEDMNSQLNS FTG QM EN I TT I S QANEQNLK 

DLQ DLH KDAENRTA I KFNQLEERFQL FETD I VN 1 1 SN I S YTAHH 

LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTLANIRLDSVSLtR 

MQQDLMRSRLDTEVANLSVIMEEMKLVDSKHGQLIKNFTILQGP 

PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGBRGPIG 

PAGPPGERGGKGSKGSQGPKGSRGSPGfCPGPQGPSGDPGPPGPP 

SKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 

KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPHWKNFTDKCYY 

FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 

tflGLTDSERENEWKWLDGTSPDYKNWKAGQPDNWGHGHGPGEDC 

^GL I YAGQWNDFQCEDVNNFICEKDRETVLSS AL 

WVLAMAA PAVGIj I S VFS S SQE IX3AAIiAQL VAQRAACCLAGARA 








RFALGLSGGSIiVSMLARELPAAVAPAGPASLARWTLGFCDERLV^ 
PFDHAESTYGLYRTHLLSRLPIPESQVITINPELPVEEAAEDYA 
KKIiRQAFQGDS I PVFDLLI LGVGP DGHT CS LFPDHP LLQERE KI 
VAPISDSPKPPPQRVTLTLPVLNAARTVIFVATGEGKAAVLKRI 
LEDQEBNPLPAALVQPHTGKLCWFLDEAAARLLTVPPEKHSPL 


6405 


1 


±*xOO 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRLVAVATLRNHRPR

TAQRAAAQVLGSSGLFmHGI^VQQQQQRNLSLHEYMSMELLQE 

AG VS VP KG YVAKS PDEAYAIAKJCLGS KDWI KAQVLAGGRGKGT 

FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKGRICNQ 

VLVCERKYPRREYYFAITMERSFQGPVLIGSSHGGVNIEDVAAE 

TPEAIIKEPIDIEEGI KKEQALQLAQ KMG F PPNI VE S AAENMVK 

LYS L FL KYDATM I E I N PM VE DS DG AVLCMDAXI N FDS NS A YRQK 

KI FDLQDWTQEDERDKDAAKANLNYIGLDGNIGCLVNGAGLAMA 

TMDIIKLHGGTPANFLDVGGGATVHQVTEAFKLITSDKICVLAIL 

VNIFGGIMRCDVIAQGIVMAVKDLEIKIPWVRLQGTRVDDAKA 

LIADSGLKILACDDLDEAARMWKJ.SEIVTLAKQAHVDVKFQLP 


6406 


1036 


167 


HPRQMRGEDTPEAPPYSSGRYDSIKTEVSGCPEDLTVGRAPTAD 
DDDDDHDDHEDNDKMNDSEGMDPERLKAFNMFVRLFVDENLDRM 
VPISKQPKEKIQAIIESCSRQFPEFQERARKRIRTYLKSCRRMK 
KNGMEMTRPTPPHLTSAMAENIIAAACESETRKAAKRMRLEIYO 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 
YSYRGYGALSSNLQPPASLQTGNHSNGESGEARAIxASRPAPSWV 
CRAALGSGMGRG KQR P VMERG CLTA 


6407 


492 


150 


VGLCIAVSQTVlAQLDALIiVFPGQ VAQLSCTLS PQHVT I RDYGV 
SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAflNAC 
VLTI S P VQPEDDADYYCS VG YG FS P 


6408


1458 


903 


RGC I TSS QAWRLFGG VTRG FNMR I E KCY FCS GP I YPGHGMM FVR 

NDCKVFRFCKSKCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 

SFEFEKRRNEPIKYQRELWNKTIDAMKRVEEIKQKRQAKFIMNR 

LKKNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE 
uv un.E,Uf\}? 


6409 


1*0 


446 


NTALANLLR C FTCDRLCGGCTAPAP PAHQG I VLQP VM PS CDPG P 

GPACLPTKTFRSYLPRCHRTYSCVHCRAHLAKHDELISKSFQGS 
HGRAYLFNSV 


6410 


85 


607 


KGGTAGCVACLGCWGQSSSPKAAFPAGSACLPADSCPCLLFQAC 
AI SGLFNC I TI HPLN IAAG VWMI MNAFILLLCEAPFCCQ FI EFA 
NTVAEKVDRLRS WQKAVF YCGMAWP I VI S LTLTTLLGNAI AFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLEGEL 


6411 


302 


772 


RLS I MAS S IiN EDPEGS R 1 TYVKGDLFACPKTDS LAHCIS E D CRM 
GAGIAVLFKKKFGGVQELLNQQKKSGEVAVLKRDGRYIYYLITK 
KRASHKPTYENLQKSLEAMKSHCLKNGVrDLSMPRIGCGLDRLQ 
WENVSAM I EEVFEATD I KI TV YTL 


6412 


61 


1709 


rpvtsfsplpgscggrlgtrtmlgrslrevsaalkqgqitptel 
cqkclslikktkflnayitvseevalkqaeesekrykngqslgd 
ldgipiavkdnfstsgiettcasnmlkgyippynatvvqklldq 
gallmgktnldefamgsgstdgvfgpvknpwsyskqyrekrkqn 
phseneds dwl i tggssggsaaavs aftcyaalgsdtggstrnp 
aahcglvgfkpsyglvsrhgliplvnsmdvpgiltrcvddaaiv 
i/5alagpdprdsttvhepinkpfmlpsladvsklcigipkeylv 
pe ls s e vqslws kaadl fe s eg akvi e vs lphts ys i vc yhvlc 

tsevasnmarfdglqyghrcdidvsteamyaatrregfndwrg 
r i lsgn ffllkenyeny fvkaq kvrr l i and fvnafns g vd vll 
tpttls ea vp yle f i kednrtrs aqdd i ftq avnmaglpavs i p 
valsnqglpiglqfigrafcdqqiiltvakwfekqvqfpviqlqe 

IjMDDCSAVLENEKLASVSLKQ 


6413 


2 


OD3 


HEP RCAGMAAS LWMGDLE P YMDENF I S RA FATMG ET VMS V KI I R " ~ 

NRLTG I PAG YC FVE FAD LATAE KCLH KI NG KPLPGAT P AKR FKL 

NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFEVKVYPSCRG 

GKWLDQTGVSKGYGFVKFTDELEQKRALTECQGAVGLGSKPVR 

LS VAI P KAS RVKPV E YSQM YS YS YNQY YQQYQN Y YAQWG YDQNT 

GSYSYSYPQYGYTQSTMQTjrEEVGDDALEDPMPQLDVTEANKEF 

MEQSEELYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPKPQPARPSSRATPGPRSPGMATSIGV 

SFSVGDGVPEAEKNAGEPENTYILRPVFQQRFRPSWKDCIHAV 

LKEELANAEYSPEEMPQLTKHLSENIKDKLKEMGFDRYKMWQV 

VIGEQRGEGVFMASRCFWDADTDNYTHDVFMNDSLFCVVAAFGC 
FYY. 


6415 


2 


1166 


FVRQWQSSHRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP 
FIEAAVAGLGAGSGKRRRGWKMPVHSRGDKKETNHHDEMEVDYA 
ENEGSSSEDEDTESSSVSEDGDSS2MDDBDCERRRMECLDEMSN 
LEKQ FTDL KDQLY KER LSQ VDAKLQEVI AG KAP E YLE PLATLQE 
NMQ I RTKVAG I YRELCLESVKNKYECE IQASRQHCESEKLLLYD 
TVQS ELE EKIRRLBEDRHS I D I TS ELWNDELQSRKKRKDPFW PD 
KKKPGWSGPYIVYMLQDLDILEDWmRKAMATLGPHRVKTEP 
PVKLEKHIjHSARSEEGRLYYDGEWYIRGQTICIDKKDECPTSAV 
ITTINHDEVWFKRPDGSKSKLYISQLQKGKYSIKHS 


6416 


410 


1519 


EIAPADLEIPACAPVLLSRATSSTT'lSVTGGKMAPSLTQEiiljSHL 
GLASKTAAWGTLGTLRTFLNFSVDXDAQRLLRAITGQGVDRSAI 
VDVLTNRS REQRQL I S RNFQERTQQDLMKS LQAALSGNLERI VM 
ALLQPTAQFDAQELRTALKASDSAVDVAIEILATRTPPQLQECL 
AVY KHNFQ VE AVDG ITSE TSG I LQDLLLALAKGGRDS Y S G 1 1 D Y 
N LA EQDVQ ALQRAEGP S R E ET W VP V7TQRNP EHLIR V FDQ YQRS 
TGQELEEAVONRFHGDAQVALLGLASVIKNTPLYFADKLHQALG 
ETEPNYQVLIRI LISRCETDLLS IRAEFRKKFGKSLYSS LQDAV 
KGDCQSALLALCRAEDM 


6417 


1 


845 


RGESRVLWSELEGEAGGAGGWASSLNARMnNRFATAFVIACVLS 
L I S T I YMAAS I GTDF WYE YRS P VQENSS DLNKS I WDEF I S DEAD 
EKT YNDALFRYNGTVGLWRRCITI PKNMHWYS PPERTES FDWT 
KCVSFTLTBQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL | 
GLMCFGALIGIjCACICRSLYPTIATGILHLLAGLCTLGSVSCYV : 
AGIELLHQKLELPDNVSGEFGWSFCLACVSAPLQFMASAIjFIWA 
AHTNRKEYTLM KAYRVA 


6418


2 


662 


TRTRPRRPPGLGAAVGKAGARSTSTPAGASPAAAYQADPPPPAH 

tpappppppcggiachgepakfygydnlqrqpifttqqeaelvq 
yfdcksssgnigedpdhlnqssspsqmfpwmrpqaapgrrrgrq 
tysrfqtlelekeflfnpyltrkrrievshalalterqvkiwfq 
nrrmkwkkennkdkfpvsrqevkdgetkkeaqeleedraegltn 


6419 


1 


973 


PGRPRVRNFDLNSKSII^EFFCTRSIQIPANRSKtAMSKCPIFP -- 
MARSISTSGPLDKEDTGRQKLISTGSLPATLQGATDSLGLEWHL 

PS P DPVTVPYLS PL wwkele s llekegdhai tvad f vdhh P I V 
FWNLVWYFRRLDLPSNLPGL I LSSBHCNKYSKI PRHCMS EDS KY 
VLIQMLWDNMKLHQDPGQPLYILWNAHTQKYPMVHLLQKSDNSF 
NQELLKSMVKS I KMNDVYGPMSQILETLNKCPHFKRQRS LYREI 
LFLSLVALGRENIDIDAFDKEYKMAYDRLTPSQVKSTHNCDRPP 
STGVMECRKTFGEPYL 


6420 


207 


1187 


RKMIDKNQTCGVGQDSVPYMICLIHlLEEWFGVEQLEDYLNFAN 
YLLWVFTPLILLILPYFTIFLLYLTIIFLHIYKRKNVLKEAYSH 
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKIPEDGPALI IFYH 
GAI P I D FY YFMAK I F I HKGRTCRWADHFV FK I PG FS LLLD VFC 
ALHG P RE KCVE I LRSGHLLA I S PGG VREAL I SDE T YN I VWGHRR 
GFAQVAIDAKVPI I PMFTQNIREGFRSLGGTRLFRWLYEKFRYP 








FAPMYGGFPVKLRTYLGDPIPYDPQITAEELAEKTKNAVQALID 
KHQRIPGNIMSALLERFH 


6421 
6422 


1844 




WAI^LRRQPERMSNKIiLSPHPHSVVLRSEFKMASSPAVLRASRL 
YQWS LKS S AQ FLGS PQLRQVGQ 1 1 RVPARMAATb I L E PAGRCCW 
DEPVRIAVRGIiAPEQPVTLRASLRDEKGALFQAHARYRADTLGE 
LDLE RAP AbGGS FAGL E PMGLLWALE PE KPLVRLVXRDVRTP LA 
VELEVLDGHDPDPGRLLCQTRHERYFLPPGVRRBPVRVGRVRGT 
LFLPPEPGPFPGIVDMFnTGGGLLEYRASLLAGKGFAVMALAYY 
N YED LPKTM ETLHL E Y FEEAMNYLLSHP E VKG PG VGLLG I SKGG 
ELCLSMASFLKGITAAWINGSVANVGGTLRYKGETLPPVGVNR 
NRI KVTKDG YADI VDVLNS PLEGPDQKS PI P VERAES T FLFLVG 
QDDHNWKSEFYANEACKRLQAHGRRKPQIICYPETGHYIEPPYF 
PliCRASLHALVGSPI I WGGE PRAHAMAQVDAWKQLQT FFHKHLG 
GREGTIPSKV 


6423 


181 


2133


EGENLSWFQKFWGDIAKEFYWKTPCPGPFIRYNFDVTKGKIFIE

WMKGATTNICYNVLDRNVHEKKLGDKVAFYWEGNEPGETTQITY 

HQLLVQVCQFSNVLRKQGIHKGDRVAIYMPMIPELWAMLACAR 

IGALHSIVFAGFSSESLCERILDSSCSLLITTDAFYRGEICLVNL 

KELADEALQKCQEKGFPVRCCIVVKHIjGRAEIiGMGDSTSQSPPI 

KRS C PD VQ I S WNQG I DIj WWHELMQEAGDE CE PEWCD AEDPLF I L 

YTSGSTGKPKGWHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 

IGWITGHSYVTYGPIaANGATSVLFEGlPTYPDVNRLWSIVDKYK 

VTKFYTAPTAIRLLMKFGDEP VTKHSRASLQVLGTVGE P INPEA 

WLWYHRWGAQRCPIVDTFWQTETGGHKLTPLPGATPMKPGSAT 

FPFFGVAPAILNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 

RFETTYFKKFPGYYVTGDGCQRDQDGYYWITGRIDDMLNVSGHL 

LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 

F3PKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 

LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 


6424 


614 


1237 


ANLKE I PRDkPPE TVLL YLDSNQ ITS I PNE I FKDLHQLR VLNI»S 
KNG I E F I DEHA FKGVAETLQTLDLS DNR I QSVHKNAFNNLKARA 
R1ANNPWHCDCTLQQVLRSMASNHETAHNVICKTSVLDEHAGRP 
FLNAANDADLCNLPKKTTDYAMLVTMFGWFTMVISYVVYYVRQN 
QEDARRHLEYLKSLPSRQKKADEPDDISTW 




1 


1188 


KKVSWPVAAMVHCSCVLFRKYGNFIDKLRLFTRGGSGGMGYPRL 
GGEGGKGGDVWWAHNRMTbKQLKDRYPRKRFVAGVGANSKISA 
LKGS KGKDWE I PVPVG I S VTDBNGKI IGELNKENDRILVAQGGL 
GG KLIiTNFLPLKGQKR 1 1 HLDLKLI ADVGLVG FPNAGKSS LLSC 
VSHAKPAlADYAFTTriKPELGKIMYSDFKQISVADLPGLlEGAH 
MNKGMGHKFLKH I ERTRQLLFWDI SGFQLSSHTQYRTAFETI I 
LLTKELELYKEELQTKPALLAVNKMDLPDAQDKFHELMSQLQNP 
KDFLHLFEKNM I PERT VE FQHI I P I S AVTGEG IE ELKNCI RKS L 

PlCf^ J\ 'KTfMTXTT^ IS T IT t/"L^/~\T" T M» r-t t n. ■ ■ n 


6425 
6426 


1850 


1144 


LAMEGGGGIPLETLKEESQSRHVLPASFEVNSLQKSNWGFLLTG 
LVGGTLVAVYAVAT PFVTPALRKVCL PF VPATM ICQ I ENWKMLR 
CRRGSLVDIGSGDGRIVIAAAKKGPTAVGYELNPWLVWYSRYRA 
WREGVHGSAKFYISDLWKVTFSQYSNWIFGVPQMMLQLEKKLE 

RELEDDARVIACRFPFPHWTPDHVTGEGIDTVWAYDASTFRGRE 
KRPCTSMHFQLPIQA 




30 


565 


S RG AAVGGMS VAGGE I RGDTGGE DTAA PGRFS FS PEPTLED I RR 
LHAEFAAERDWBQFHQPRNLLLALVGEVGELAELFQWKTDGEPG 
PQGWSPRERAALQEELSDVLIYLVALAARCRVDLPLAVLSKWDI 
NRRR YPAHLARS SS RKYT BL PHGAIS EDQAVGP AD I P C DS TGOT 
ST 


6427 


i%> 959 


AASWGPPHVPKAGKMVSWMICRLWIiVFGMLCPAYASYKAVKTK 
MIRE YVRWMMYW I V FALFMAAE I VTD IFISWFPFYYEI KMAF VL 


6428






WLLSPYTKGASLLYRKFVHPSLSRHEKEIDAYIVQAKERSYETV 
LS FG KRGLN IAAS AAVQAATKSQGA LAGRLRS FSMQDLRS I S DA 
PAPAYHDPLYLEDQVSHRRPPIGYRAGGLQDSDTEDECWSDTEA 
VPRAPAR PR E K P LI RSQS LR WKRKP PVREGTSRS LKVRTRKKT 
VPSDVDS


6429


1982 


444 


SGSGGKMEUHQHVPIDIQTSKLLDWLVDRRHCSLKWQSLVLTIR 
EKINAAIQDMPESEEIAQLLSGSYIHYFHCLRILDLLKGTEAST 
KNI FGRYSSQRMKDWQE 1 1 ALYBKDNTYLVELSSLLVRNVNYEI 
PSLKKQIAKCQQLQQEYSRKEEECQAGAABMREQFYHSCKQYGI 
TGENVRGELLALVKDLPSQLAE IGAAAQQSLGEAIDVYQAS VGF 
VCES PTEQVLPMLRFVQKRGNSTVYE WRTGTE PS WERPHLEEL 
PEQVAEDAI DWGDFGVEAVSEGTDSG I SAEAAG I DWG I FPESDS 
KDPGGDGIDWGDDAVALQITVLEAGTQAPEGVARGPDALTLLEY 
TBTRNQFLDELHELEIFLAQRAVELSEEADVLSVSQFQLAPAIL 
QGQTKEKMVTMVS VLEDLIGKLTSLOLQHLFMI LAS PRYVDRVT 
EFLQQKLKQSQLLALKKELMVQKQQEALEEQAALEPKLDLLLEK 
TKELQKLIEADISKRYSGRPVNLMGTSL 




3413 


3442 


EPSSWTAAPRGPLAAHPLEAAVQEDDRRALSFDSRIKVFANGTL 

WKSVTDKDAGDYLCVARNKVGDDYWLKVDVVMKPAKI EHKEE 

NDHKVFYGGDLKVDCVATGLPNPEISWSLPDaSLVNSFMQSDDS 

GQRTKRYWFNNGTLYFNEVGMREEGDYTCFAENQVGKDEMRVR 

VKWTAPATIRNKTCLAVQVPYGDWTVACEAKGEPMPKVTWLS 

PTNKVIPTSSEKYOIYQDGTLLIQKAQRSDSGNYTCLVRNSAGE

DRKTVWIHVNVQPPK1NGNPNPITTVREIAAGGSRKLIDCKAEG 

IPTPRVLWAFPEGWLPAPYYGNRITVHGNGSLDIRSLRKSDSV 

QLVCMARWEGGEARLIVQLTVLEPMEKPIFHDPISEKITAMAGH 

TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMLH 

rSGLSSVDAGAYRCVARNAAGHTERLVSLKVGLKPEANKQYHNL 

VSIINGETLKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSL 

LDNGTLTVREASVFDRGTYVCRMETEYGPSVTSIPVIVIAYPPR 

ITSEPTPVIYTRPGNTVFCLNCMAMGIPKADITWELPDKSHLKAG 

VQARLYGNRFLHPQGSLTIQHATQRDAGFYKCMAKNILGSDSKT 
TYIHVF 


6430 


1946 


602 


RTRVSTGLRRTLLWSEAVGASSTRGDTGIPGSGEGGAGPGGGEG 
AMLEAMAE PS PBDP PPTLKPETQPPEKRRRTI EDFNKFCS FVLA 
YAGYIPPSKEESDWPASGSSSPLRGESAADSDGWDSAPSDLRTI 
QTF VKKAKSS KRRAAQAG PTQPG P PRS T FS RLQA PDS ATLLE KM 
KLKDSLFDLDGPKVASPLSPTSLTHTSRPPAALTPVPLSQGDLS 
KPPRKKDRKNRKLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 
KRKLKKAERGDRLP PPGPPQAPPSDTDS EEEEEEEEBEEEBEMA 
T WGGEAP VPVLPT P PE APR PP AT VH PEGVP PADS ES KEVGS TE 
TSQDGDASSSEGEMRVMDEDIMVESGDDSWDLITCYCRKPFAGR 

PMIECSLCGTWIHLSCAKIKKTNVPDFFYCQKCKELRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCEACAMODGRKGGAYAOKMI?ATTar , \/r , S — 

LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 

RELRLRNYVPEDEDLKKRRVPQAKPVAVEEKVKEQLBAAKPEPV 

IEEVDLANLAPRKPDWDLKRDVAKKLEKLKKRTQRAIAELIRER 

LKGQEDSLASAVDAATEQKTCDSD 


6432 


56 


1692 


GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
DYSDQEVLQTLTKFCFPFYVDSLTVSQVGQPTFTFVLTDIDSKQR 
FGFCRLSSGAXSCFCILSYLPWFEVFYKLLNILADYTTKRQENQ 
WNELLETLHKLPI PDPGVSVHLS VHS Y FTVPDTRELPSI PENRN 
LTEYFVAVDVNNMLHLYASMLYERRILIICSKLSTLTACIHGSA 
AML YPM YWQHVYI PVLPPHLLDYCCAPMPYLIGIHLSLMEKVRN 
MALDD VV I LNVDTNTLE T PFDDLQS LPND V I S S LKNRLK K VS TT 


6433 






TGDGVARAFbKAQAAFFGSyRNALKIEPBBPITFCEKAFVSHYR 
SGAMRQFLQNATQLOLFKQFIDGRLDLUJSGEGFSDVFZEEIfJM 
GBYAGSDKLYHQWLSTVRKGSGAI LNTVKTKANPAMKTVYKFDI 
AENGCAPTPEEQLPKTAPSPLVEAKDPKLREDRRPITVHFGQVR 

PPRPHWKRPKSNIAVEGRRTSVPSPEQNTIATPATLHILQKSI 
| TKFAAKFPTRGWTSSSH 


6434 


1524 


404 


AP VTKRKEVF akds kgsaldagrdpkrpalpetlcesgwasnta 

PTTPPQPGWCLCGKDFKSSCQTPGREKERRIiATMHGSCSFLMLL 
LPLLLLLVATTG PVGALTDEEKRLMVELHNI>YRAQVS PTAS DML 
HKRWDEELAAFArCAYARQCVWGHNKERGRRGENLFAlTDEGMDV 
PLAMEEWHHEREMYNLSAATCSPGQMCX3HYTQVVWAKTERIGCX3 
SHFCEKLQGVEETNIELLVCMYEPPGNVKGKRPYQEGTPCSQCP 
SGYHCKNSLCEPIGSPEDAQDLPYLVTEAPSFRATEASDSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGLIjLLPPLVLAGIF 


6435 


40 


2002 


MPQLNFGMADPTQMGGLSMLLLAGEHALGTPBVFSGTCRPDVSE 
SPELRQKSPLFQFAEISSSTSHSDASTKQCQTSALFQFAEISSN 
TSQLGGAEPVKRCGKSALKQLAEMCLASEGMKMEESKLIKAKES 
DGGR I KE LEKG KE E KE I KME KTD ETR JjQKEAE FEKS AKENLRDS 

KELRNFEALQIDDIMAIKMEDPKEIRKEELEEDHKCSHFPDFSY 
SASSKII ISDVPSRKDHMCHPHGIMI IEDPAALNKPEKLKKKKK 
KSKMDRHGNDKSTPKKTCKKRQSSESDIESVIYTIEAVAKGDWG 
IEKLGDTPRKKVRTSSSGKGSILDAKPPKKKVKSREKKMSKEKS 
SDTTKESRPPDFrsiSASKNISGETPEGIKAEPLTPMEDALPPS 
I>SGQ AKPEDSDCHR Kl ETCGS RKS ERS CKG ALY KTL VS EGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNEESWTFSQSGTSGSKKFKK 
TKPKEDCLLGSAKLDEEFEKKFNSLPQYSPVTFDRKCVPVPRKK 
K KTGNVS SEPTKTSKGSGDKW SNKQL FLDA I H PTEA I FS EDRNT 

MEPVHKVKNIPSIFNTPEPTTTARTFGGQPKEKSKENPDYSPCQ 
DTQRAGYKHEEVLWMTNIWNNCGGVYLKQLRHTAMTNA 


6436 


2227 


657 


ALQRDAAAAYAHPEYEERFLQEETVSQQINSIELLQTRPLALPE 
VVKS QRPLQRQVHLRGR PASQPTVIRGI TYYKAKV SE E END I E E 
QQDEFFSGDNGVDLIiIEDQLLRHNGLMTSVTRRPAATRQGHSTA 
VTSDLNARTAPWSSALPQPSTSDPSIANHASVGPTLQTTSVSPD 
PTRESVLQPSPQVPATTVAHTATQQPAAPAPPAVSPREALMEAM 
HTVP VPPTTVRTDS LG KDAPAGRGTTPAS PTLSP EBEDD IRNVI 

GRCKDTLSTITGPTTQNTYGRNEGAWMKDPLAKDERIYVTNYYY 
GNTLVEFRNLENFKQGRWSNSYKLPYSWIGTGHWYNGAFYYNR 
AFTRNI I KYDLKQR YVAAWAMLHDVAYEEATPWRWQGHSDVDFA 
VDENGLWLIYPALDDEGF5QEVIVLSKLNAADLSTQKETTWRTG 
LRRNFYGNCFVI CGVLYAVDS YNQRNANI S YAFDTHTNTQI VPR 
LLFENEYFYTTQIDYNPKDRLLYAWDNGHQVTYHVIFAY 


6437 


1295 


341


GACRPPVRQDPDSGPDYEALPAGATVTTHMVAGAVAGILEHCVM 
YPI DCVKTRMQS LQPDPAARYRNVLEALWR 1 1 RTEGI*WR PMRGL 

NVTATGAGPAHALYFACYEKLKKTLSDVI HPGGNSH I ANGAAGC 

VATLLHDAAMNPAE WKORMOMYWQ PVHDwnnfDRTnurt*™.^ 
"» * v v i\.yiu*iuro i uo tr I nK V i J0CVKAVWQNEGAG 

A FYRS YTTQLTMNV P FQA I HFMTYE FLQE HFNPQRR YN PS SHVL 
SG ACAGAVAAAATT PIiDVCKTLLNTQESl. ALNSHI TGH I TGMAS 
AFRT VYQ VGG VTAY FRG VQ ARVI YQ I PSTAI AWS VYE F FKYL I T 
KRQEEWRAGK 




1828 


360


PPAPAPPASPARHVTRTARGHLEGGSRAPPLLQAVFLQIKNMVK 
LIHTliADHGDDVNCCAFSFSLLATCSLDKTIRLYSLRDFTELPH 
S PLKFHT YAVHCCCFS PSGH ILAS CS TDGTTVLWNTENGQMZiAV 
Q P SGS P VR VCQFSPDS TCLASGAADGTWLWNAQS YKL YRCG 
SVKDGSLAACAFS PNGSFF VTGSSCGDLTVWDDKMRCLKSEKAH 
DLG I TCCDFS SQ PVSDGEQGLQFFRLAS CGQDCQVKI W I VS F TH 
t LG FELK YKS TLS GHCAP VLACAFS HDGQMLVS GS VDKS V I VYD 


6438






TNTBNILHTLTQHTRYVTTCAPAPNTLLIATGSMDKTVNIWQFD 
LBTbCQARSTEHQLKQFTEDWS EEDVSTWLCAQDLKDLVG I FKM 
NNIDGKELLNXiTKESLADDLKIESLGLRSKVLPTCTPPT DTin/vc 
LS SGI PDE P I CP I TRELMKDP V I ASDGYS YE KE AMENWD P AKRN 
RTSPP 




109 


901 


EVQILRAKMFQTGGLIVrYGLLAQTMAQFGGljPVPI^QTIiPL^' 
NPALPLSPTGltAGSLTNAL^NRT.T.Qrn t^yt ttktt nunrr 

GGTSGGLLGGLLGKVTS VI PGLNNI IDI KVTDPQLLELGLVQSP 
DGHRLYVTI PLGIFCLQVNTPLVGASLLRLAVKLDITAEILAVRD 
KQER I H LV LGDCTHS PGS LQI S LLDGLGPLP I QGLLDS LTG I LN 
KVLPELVQGNVCPLVNEVLRGLDITLVHDIVNMLIHGLQFVIKV 


6439 
6440 


23 


412 


SIQTASAITTEMASQSQGIQQLLQAEKRAAEKVADARXRKARRL 
KQAKEEAQMEVEQYRREREHEFQSKQQAAMGSQGNLSAEVEQAT 
RRQVQGMQSSQQRNRERVbAQLLGMVCDVRPQVHPNYRISA 




3 


517 


RARWNSDMGDLPGLVRLSIAl^IQPNDGPVFYKVDGQRFGQNRT 
i jmjLi i ijia js x JvVc.VKX K PS TLQVEN I S IGGVLVPLE LKSKE PDGD 

RVWTGTYDTEGVTPTKSGERQPIQITMPFTDIGTFETVWQVKF 
YNYHKRJDHCQWGSPFSVIEYECKPNETRSLMWVNKESFL 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLLGSVGDALGY 

RNVCiCENSTVGMKIQEELQRSGGLDHLVLSPGEWPVSDNTIMHI 

ATAEALTTDYWCbDDLYREMVRCYVE I VEKLPERRPDPATI EGC 

*UUA,fNn i JjLiAWH I PFNEKGSGFGAATKAMCIGLRYMKPERIjET 

LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGKPLVQWGRD 

MLRAVPLAEEYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISK 

DSENKAIFPDNYDAEEREKTYRKWSSEGRGGRRGHDAPMIAYDA 

LLAAGNSWTELCHRAMFHGGESAATGTIAGCLFGLIiYGLDIiVPK 
GIj YODLEDKE ICTi PDT /3&aT . v p t .qtb" u v 


6442 


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGHEEDVEGALAGLKTVSS YS~ 

LQRQSLLDMSLVKLQLCHMLVEPNLCRSVLIANTVRQIQEEMTQ 

DGTWRTVAPQAAERAPLDRLVSTEILCRAAWGQEGAHPASGLGD 
GHTQGPVSDLCPVT<5AnaPRPT^QQnuPMrw^nnt?xmi^rt»-iTi^rt-r ~ 
W r v ouiA-r v i &HUH^KttiAibbAWiiMlA>PHENRGSFHKSLD 

QIFETLETKNPSCMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEGLAPATPGPSS S CKSDLGELDHWE ILVET 


6443 


2 


555 


MASPAASSVRPPRPKKEPQTLVIPKNAAEEQKLKLERLMKNPDK 
AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGEFHVYRHLRRR 
EYQRQDYMDAMAEKQKLDAEFQKRLEKNKIAAEEQTAKRRKKRQ 
IU^KBKKIjIiAKKMKLEOKKOEGPGOPKFOR<3QC;oairacr''rc'C'i7E'r> 
VPSFTMGR 


6444 


390 


899 


GSTPRGKMRAPIPEPKPGDLIEIPRPFYRHWAIYVGDGYWHLA 
PPSEVAGAGAASVMSALTDKAIVKKELLYDVAGSDKYQVNNKHD 
DKYSPLPCSKIIQRAEELVGQEVLYKLTSEMCEHFVNELRYGVA 
RSDQVRDVI IAAS VAGMGLAAMS L I G VM FS RN KRQKQ 


6445 


2 


753 


AGAAGAAGAARSPRPQAHTKGVRGLPSRRRSPDCGRMELAAGSF 
S E EQFWEACAELQQ P ALAGAD WQLL VETS G I S I YRLLDKKTG L Y 
EYKVFGVLEDCSPTLLADIYMDSDYUKQWDQYVKELYEQECNGE 
TWYWE VKY P FPMSNRD YVYLRQ RRDLDM EG RKI HVI LARS TSM 
PQLGERSGVIRVKQYKQSLAIESDGKKGSKVFMYYFDNPGGQIP 
SWLINWAAKNGVPNFLKDMARACQNYLKKT 


6446 


1 


1651 


KCPTKb'PPPDTPGSRGTTAMCSLASGATGGRGAVENEEDLPELS 
DSGDEAAWEDEDDADLPHGKQQTPCLFCNRLFTSAEETFSHCKS 
EHQFNIDSMVHKHGLEFYGYIKLlNFIPJiKNPTVEYMNSIYNPV 
P W E XEE YLK PVLEDDLLLQFD VEDL YEP VS VP FS YPNGLSENTS 
WE KLKHMEARALS AEAALARAR E DLQKMKQFAQDF VMHTDVRT 
CS S STSVI ADLQED EDG VYFS S YG HYG I HE EMLKDK I RTE S YRD 
FI YQNPH I FKDKWLDVGCGTG I LSMFAAKAGAKKVLGVDQSE I 


6447 






bYQAMDIIRLNKLEDTITLIKGKIEEVHLPVEKVD^HSEWMGY 
FLLFESMLDSVLYAKNKYLAKGGSVYPDICTISLVAVSDVNKHA 
DRIAFWDDVYGPKMSCMKKAVlPEAWPVT.novTT tcpd^tvu 

IDCHTTSISDLEFSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN 
R WFSTG PQS TKTH WKQT VFLLE KPFS VKAG EALKG KVT VHKN K 
KDPRSLTVTLTLNNSTQTYGLQ 




1554


1068 


RLGPABWHLSGPCHATl^AANRGRALGVRAAWRGAPLCQRVMMP 
SRTNLATGIPSSKVKYSRLSSTDDGYIDLQFKKTPPKIPYKAIA 
LATVLFLIGAFLIIIGSLLLSGYISKGGADRAVPVLIIGILVFL 
PG F YH LR I A Y Y AS KG YRG YS YDDI PDFDD 


6448 
6449 


74 


S59 


GQVLSHCYHYRSSRWRRGGLSRGRGAGVMALVPYEETTEFGLQK 
FHKPLATFSFANHTIQIRQDWRHLGVAAWWDAAIVLSTYLEMG 
^ » i^ijrt<ji\o>\v ciaj/^j i ^1jV(jI VAALI^CRI RYERI3KN FLAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL 




597 


1876 


KYGVCENLRKbElTGVSCRDN/VAKLLHRVRHILGLWQPDIGPYG 
GLLNWVDGLFIIGWMYLPPHDPHVDDPMRFKPLFRlHbMERKA 
ATVECMYGHKGPHHGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE 
r k. i w JjK h b w UKTJbED I FH EHMQEL I LMKF I YTS Q YDNCLT YRR I 

YLPPSRPDDLIKPGLFKGTYGSHGLEIVMLSFHGRRARGTK1TG 
DPNIPAGQQTVEIDLRHRIQLPDIiENQRNFNELSRIVLEVRERV 
RQEQQEGGHEAG EGRGRQGPRES Q P S PAQPRAE APS KG P DGTPG 
EDGG E PG DAV AAAEQP AQ CGQGQ P F VLP VGVS SRNEDYPRTCRM 
CF YG TGL I AGHG FTS PE RTPG VF I LFDEDRFG FVWLELKS FSLY 


6450 


848 


269 


FVPAPRTVSGKRSLPGEWEERGEGEQRTCREFSGNGGRAVEAAR 
MRLLCGLWLWLSLLKVLQAQTPTPLPLPPPMQSFQGNQFQGEWF 
VLGLAGNSFRPEHRALLNAFTATFELSDDGRFEVWNAMTRGQHC 
DTWS YVLI PAAQPGQFTVDHRVWTHEQAGR PQDQPAGQELVAAS 
RDAGPVHLPGQSSGPIiG 


6451 


232 


939 


HSPTPPTSPRASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 
I LPGL FLGPY S S AMKS KL P VLQKHG I THI I C I RQNT E ANF I KPN 

FQQLFRYLVLDIADNPVENIIRFFPMTKEFIDGSLQMGGKVLVH 

GNAGISRSAAFVIAYIMETFGMKYRDAFAYVQERRFCINPNAGF 
VHQLQEYEAIYLAKIjTIOMMQPTOTPdct c\njor"nv-cT irnrmir, 

EEDDFGTMQVATAQNG 


6452 


1 


652 


RTRGESS^EPIJUXypLKCSGPRAKV^AVLLSIVLCtVTLFLLQ - 
LKFLKPKINSFYAFEVKDAKGRTVSLEKYKGKVSLWNVASDCQ 
LTDRNYLGLKELHKEFGPSHFSVLAFPCNQFGESEPRPSKEVES 
FARKNYGVTFPIFHKIKILGSEGEPAFRFLVD^^KiTPDDWKrjnjif 
YLVNP EGQWKFWRPEE P IEVIRPDIAALVRQV 1 1 KKKEDL 


6453 


827 


223 


HRRWIiPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSG"SC~" 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTCKRYCINSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 


6454 
6455 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADS SG TAEGGSGVAS PAQAD KAEL 




1042 


173 


KVHbATVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL" 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKLEDFINNINSVLE 
SLYIEI KRGVTEDDGRPI YALVNLATTS IS KMATDFAENELDLF 
RKALELIIDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF 
VQNKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVK1CNICHSL 








LIQGQSCBTCX3IRMHLPCVAKYFQSNAEPRCPHCNDYWPHEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


RPQSRSISMWRNSLLQVSSGLRWLRVCAMVDILGERHLVTCKGA 
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAIPKLVIVKQNGEVITNKGRKQIRERGliACFQDWVEAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI IHFPDFDKKIPV 
KLF PLPLLYVGNHI S G LS S TS KL S L PMFTVLRKFT I PLTLLLET 
IILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGYI FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFMI I PTLI ISVSTG 
DLQQ ATEFNQWKNWF I LQFLLS CFLGFLLMYSTVLCS YYNSAL 
TTAWGAI KNVS VAYIG I LIGGDYI FSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIM I LYVSKLNK I IHFPDFDKKIPV 
KLF PLPLL WGNH I SGLS S TS KLS L PMFTVLR KFT I PLTLLLET 
IILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGYI FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFM 1 1 PTLI I S VSTG 
DLQQATEFNQWKNWF ILQFLLSCFLGFLIMYSTVLCS YYNSAL 
TTAWGAI KNVS VAYIG I LIGGDYI FSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6459 


23 




PTTGFPVTNFPWNWPDGKP PI MI LYVSKLNK I IHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
I ILGKQYSLNI ILSVFAI I LGAPIAAGSDLAFNLEGYI FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMI I PTLI ISVSTG 
DLQQATE FNQWKNWFI LQFL LS C FLG FLLM YST VLCS YYNSAL 
TTAWGAI KNVSVAYIG I LIGGDYI FSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKL.NKIIHFPDFDKKIPV 
KLFPLPLLYVGNH I SGLSSTSKLSL PMFTVLRKFT I PLTLLLET 
IILGKQYSLNI ILSVFAI ILGAFIAAGSDLAFNLEGYI FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFMI I PTLI ISVSTG 
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAI KNVS VAYIG I L IGGDY I FS LLNFVGLNICMAGGLR Y 
SFLTLSSQLKPKPVGEENICLDLKS 


6461 


1653 


360 


IiQQRTLRITAVGQTHPIAWMAWePSLGAFYGPASFITFVNCMYF 
LS IFIQLKRHPERKYELKEPTEEQQRLAANENGE INHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATS LS FS AF F WHHCVNRE DVRLAW I MTCCPGRS S 
YSVQVNVQPPNSNGTNGEAPKCPNSSAESSCTNKSASSFKNSSQ 
GCKLTNLQAAAAQCHANSLPLNSTPQLDNSLTEHSMDND I KMHV 
APLE VQFRTNVHS S RHHKNRS KGHRASRLTVLRE YAYDVPTS V E 
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC 
STLPKS S RNFEKP VSTTS KKDALRKPAWE LENQQKS YGLNLAI 
QNGPI KSNGQEGPLLGTDSTGNVRTGLWKHETTV 


6462 


3 


773 


SEELDREKKLKJSDS PRKTPNKESG VPSL PVSLTS I KEEPKEAKH 
PDSQSMEESKLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSYIQ 
YLHAYPYPQMYDPSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYG 
KMSGREETEKVNTS PS VNTKTTTES KALDLLQQHANQYRS KS PA 
PVEKATAEREREAERERDRHS PFGQRHLHTHHHTHVGMGYPL I P 
GQYD P FQG LTSAAL VASQQVAAQAS ASGM FPGQRRE 


6463 


2 


350 


V I LC I LGG W I FKNADRSMEKKKGE PRTRAE ARP WVDE DLKDS S D 
LHQAEEDADE WQES E ENVEH I P FSHNHY PEKEMVKRS QEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


G I LRQKEREERNRI HKKE I LFLEHLLW PS EMSS LSGKVQTVLG 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LVEPSKLGRTLTJIBKLAMTFDCCyCPPPPCQEAISKBPIVMKNIi 
YWIQKNAYSHKENIiQLNQETEAIKEELLYFKANGGGALVENTTT 
Gl SRDTQTLKRLAEETGVHI ISGAGFYVDATHSSETRAMSVEQL 
TDVLMNE I LHGADGTS IKCGIIGE IGCSWPLTESERKVLQATAH 
AQAQLGCPVI I HPGR S SRAP FQ I IRILQEAGADISKTVMSHLDR 
TIUDKKELLEFAQLGCYLBYDLFGTELLHYQLGPDIDMPDDNKR 
IRRVRLLVEEGCEDRILVAHDIHTKTRLMKYGGHrrvQUTT tt\ttaj- 
P KMLLRG I TENVLDK I L I ENPKQWLTFK 


6465 


126 


1396 


KMTVFFKTLRNHWKKTTAGLCLLltVGGHWLYGIOiCDWLLRRAAC 
QEAQVFGNQLI PPNAOVKKATVFL.MPAarwrvivDTT ctvvtts * r»r 

LHLSGMDVTIVKTDYEGQAKKLLELMENTDVIIVAGGDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTijFAESGNKVQHI 
TDArLAIVKGETVPLDVLQIKGEKEQPVFAMTGIiRWGSFRDAGV 
KVS KY WYLE P LK I KAAH FFSTL KEW PQTHQAS I S YTG PT ERP PN 
EPEETPVQRPSLYRRILRRLASYWAQPQDALSOEVSPEVWKDVQ 
LSTIELSlTTRNNOIiDPTSFOEDFLNICIEPDTISKGDFITIGSR 
KVRNPKLHVEGTECLQASQCTLLI PEG AGGS FS I DSEE YEAMP V 
EVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARGTELSQLEKAHPPADMGRRKSKRKPPPKKKMTGTLETQFTC 
P FCNHE KS CD VKMDRARNTG V I S CTVCLEE FQT P I TYLS E P VD V 
YS D W I D ACEAANQ 


6467 


301 


2571 


GBLRVLALAHGELACHAVLTASLLSLRSRLMDSDMDYERPNVET 
I KCVWGDNAVG KTRL I CARACNATLTQYQ LLATHVPTVWAI DQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLCFSIANPNSLHHVKTMWYPEIKHFCPRAPVILVGCQLDLRY 
ADLEAVNRARRPLARP I KPNE ILPPE KGREVAKELGI P YYETS V 
VAQFGI KDVFDNAI RAALI SRRHLQ FWKSHLRNVQRPLLQAPFL 

PPKPPPPIIWPDPPSQ*?RFr , DaUT.T.PntiTr'3vrv^rTr «»r 

FAHKIYLSTSSSKFYDIiFLMDLSEGELGGPSEPGGTHPEDHQGH 
SDQHHHHHHHHHGRDFLLRAASFDVCESVDEAGGSGPAGLRAST 
SDGILRGNGTGYLPGRGRVLSSWSRAFVSIQEEMAEDPLTYKSR 
LMWVKMDSSIQPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 
LEVFDLRMMVANILNNEAFMNQEITKAFHVRRTNRVKECLAKGT 
FSDVTFILDDGTISAHKPLLISSCDWMAAMFGGPFVESSTREW 
FPYTS KSCMRAVLEYLYTGMFTSSPDLDDMKLT ILANRLCLPHL 
VALTEQY TVTGLMEATQMMVD I DG DVLVFLELAQ FHCA YQLAD W 
CLHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 

2DHYQRARKEREKEDYLHLKRQPKRRWLFWNSPSSPSSSAASSS 
SPSSSSAW 


6468 


3 


1374 


DAWAGTNMAALAP VGS PASRGPRLAAGLRLLPMLGLLOLLAEPG 
IX3R VHHIJU^KDD VRHKVHLNT FGFFKIX3 YMVVNVS S1»S LNE P ED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLIilLDI 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 
OTQKTQDGGKS KRS TVDS KAMGE KS FS VHNNGGAVS FQFFFN I S 

TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GBIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL 
P FTKS LS LVFHAI D YHY I SS QG FPI EGWAWYY I THLL KG ALL F 

ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPWWSIRHLQEASATD 
GKGKFS RAHFVLLS LL 


6469 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG 
IjGRVHHIJUjKDDVRHKVHLNTFGFFKDGYMVVNVSSLSLNEPED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILJCKQSVSVTLLILDI 
SRSEVRVXS PPEAGTQLPKI IFSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
rDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 








GEIPLPKLYISMAFFFFI^GTIWIHILRKRRNDVFKIHWI.MAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
I TI AL I GTGWAF I KH I LS DKDKKI FMI VI PR RVLANVAY HIES 
TEEGTTE YGLWKDSL FLVDLLCCGA1 LFP WWS I RHT^PttcaTn 
GKGKFSRAHFVLLSLL 


6470 


2726 


1437 


AAASG VS S RADAP VLAQS PAS AGNGRPS TP R VPGS RRH PS AP R S 
G PL PRE DG C RT PG PQLLPLPGALLR PRTLLS S AAETGRS RHPDT 1 
QHPSSGGRCRGGTES PS SAAGR PASMAEAEEDCHSDTVRADDDE 
BNESPABTDLQAQLOMFRAQWMFELAPGVSSSNLENRPCRAARG 
SLQKTSADT KGKQEQAKEEKARELFLKAVE EEQNGALYEAI KFY 
RRAMQLVPDi EFKI TYTRSPDGDGVGNSYI EDNDDDSKMADLLS 
YFQQQLTFQESVLKLCQPELESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKLV 
PYTSWREMFLERPRVRFDGVYISKTTYIRQGEQSLDGFYRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPO^TVPRT.PTo 


6471 


1750 


299 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
GPRNKKRGWRRLAQEPLGLEVDQFLEDVRLQERTSGGLLSEAPN 
EKZjFFVDTGSKEKGLTKKRTKVQKKSLLLKKPLRVDLILENTSK 
VPAPKDVLAHQVPNAJCKLRRKEQLWEKIjAKQGE lprevrraqar 

llnpsatrakpgpqdtverpfydlwasdnpldrplvgqdeffle 
qtkkkgvkrparlhtkpsqapavevapagasynpsfedhqtlls 

AAHEVELOROKEAEKLEROIiATiPaTFAaaTnvcTirAT?r i^r^t t c 

ESDGEGEPGQGEGPEAGDAEVCPTPARtiATTEKKTEQQRRREKA 
VHRLRVQQAALRAARLRHQELFRLRGIKAGVALRLAELARRQRR 
RQARREAE ADKP RRLGR L KYQ APD I D VQLSS ELTDSLRTLKP EG 
NILRDRFKSFQRRNMIEPRERAKFKRKYKVKLVEKRAFREIQL 


6472 


3 


897 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPGTTTP 
ARVDLQQQIMTI I DELG KAS AKAQNLS AP ITS AS RMQSNRHW Y 
ILKDSSARPAGKGAIIGFTKVRYK'IfT.Pirr.nnppauKic^rpriT rrr 

DFYIHESVQRKGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF 
IiNKH YNLETTVP QVNNFV I F EGFFAHQHR PPAPS LRATR H S RAA 
AVDPTPAAPARKLPPKRABGDIKPYSSSDREFLKVAVEPPWPLN 
RAPRRATPPAHPPPRSSSLGWSPERGPLRPFVP 


6473 


22 


912 


SSAVEFVWEGEKMAAEPNKTEIQTLFKRLRAVPTNKACFDCGAK 
NPS WAS I T YG V FLC I DCS G VHRSLG VHLS FIRS TE LDSNWNW FQ 
LRCMQVGGNAHATAFFRQHGCTANDANTKYNSRAAQMYREKI RQ 
LGS AALARHGTDLWI DNMSSAVPNHSPEKKDSDFFTEHTQP PAW 
DA PAT E PSGTQQ P APS TES SGLAQP BHG PNTDLLGTS P KAS LEL 
KSS 1 1 G KKKPAAAKKGLGAK KGLGAQKVS SQS FSEI ERQAQVAE 
KLREQOAADAKKQAEESMVASMPJLAYQELOIDR 


6474 


3 


462 


LQRQRQH P AAAPAVP VRC FT FC FTDI V I M PKRKS PENTEG KDG S 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGK KE EKQEAG KEG TAP S ENG ETKAE E I H I SRSTVNVSTS RG TP 
PS TI*S VKGQI ETVRVKGTEN 


6475 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476 


106 


1090 


ARAMAQYKGTMREAGRAMHLLKKRERQREQMEVLKQRIAEETIL 
KSQVDKRFSAHYDAVEAELKSSTVGLVTLNDMKARQEALVRERE 
RQLAKRQH LEEQRLQQERQR EQEQRRERKRKI SCLS FALDDLDD 
QADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAQREKVKDEEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL 
QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK 
SGPLFSFDVHDDVRLLSDATMEKDESHAGKWLRSWYEKNKHIF 
PAS RWBAYDPEKKWD KYTI R 

6477 






227 


915 


LQGHI>1GI[WSRPLSRFWEWGKNIVCVGRNYADHVREMRSAVL 
SEPVLFLKPSTAYAPEGSPII^PAYTRm^HKLELGVVMGKRCR 
AVPEAAAm^YVGGYALCIiDMTARDVQDECKKKGLPWTLAKSFTA 
SCPVSAFVPKEKIPDPHKLKLWLKVNGELRQEGETSSMIFSIPY 
IIS YVSK 1 1 TLEEGD 1 1 LTGTP KG VG P VKENDE I EAG 1 HG LVSM 
TFKVEKPEY 


6478 


2 


1495 


**w**rw*^i*#*a AiJisMr^ijKJUitiijjJt.5SWKKQTTNIRKTF 
I FMEVLGSGAFSEVFLVKQRLTGKLFALKCI KKS PAFRDSSLEN 
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGBLFDRILE 
RG V YTEKDAS L VI QQVLS A VKYLH ENGI VHRD L KP ENLL YLTPE 
ENS KIMI TDFGLS KMEQNG I MS TACGTPG YVAPEVLAQKP YS KA 
VDCWSIGVITYILLCGYPPFYEETESKLFEKIKEGYYEFESPFW 
DDISESAKDFICHLLEKDPNERYTCEXALSHPWIDGNTALHRDI 
YPSVSLQIQKNFAKSKWRQAFNAAAVVHHMRKLHMNLHSPGVRP 

EVENRPPRTOA'JPTQlJDCCDPT'T'TTPfcm^fMf^ifin' _ ____ 

o v an n. tr r o i a i r b b F Ji I T Z TE A P VLDHS VALPALTQLPC 
QHGRRPTAPGGRSLNCLVNGSLHISSSLVPMHQGSLAAGPCGCC 
S SCLN I GS KGKS S YCS E PT LLKKANKKQN P KS E VM VP VKASGSS 
HCRAGQTG VCL I M 


6479 


3 


949 


SCRGPGWHPaggQAGAMELLSALSLGELALSFSRVPLPPVFDLS 
YFIVSILYLKYEPGAVELSRRIIPIASWLCAMLHCFQSYILA0LL 
LGE PL I D YFS NNS S I LLAS A VW Y L I F FC PLDL F Y KC VCFLP VKL 
IFVAMKEVVRVRKIAVGIHHAHHHYHHGWFVMIATGWVKGSGVA 
LWSNFEQLLRGVWKPETNEILHMSFPTKASLYGAILFTLQQTRW 
LPVSKAS LI FI FTLFMVSCKVFLTATHSHSS PFDALEGYI CPVL 
FGS ACGGDHHHDNHGGSHSGGG PGAQ HS AMP AKS KEE LS EGS RK 
KKAKKAD 


6480 


192 


514 


DFMSIYFPIHCPDYLRSAKMTEVMMNTQPMEEIGLSPRKDGLSY 
Q I F PDPS DFDRCCKLKDRLPS I WEPTEGBVESGELRWP PEE FL 
VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVIAGGTLAIPILAFVASFLLWPSALIRIYYWY 
WRRTLGMQVRYVHHEDYQFCYS FRGRPGHKPS ILMLHGFSAHKD 
MWLS WK FLPKNLHLVCVDMPGHEGTTRS SLDDLS I DGQVXR I H 
QFVECLKLNFOKPFHLVGTSMGGQVAGVYAAYYPS DVS S LWLVCP 
AGLQYSTDNQFVQRLKELQGSAAVEKIPLIPSTPEEMSEMJLQLC 
SYVRFKVPQQILQGLVDVRIPHNNFYRKLFLEIVSEKSRYSLHQ 
NMDKI KVPTQ 1 1 WG KQDQ VLDVS GADMLAKS I ANCQ VE LLENCG 
HSVVMERPRKTAKLI IDFLASVHNTDNNKKLD 


6482 


2517 


568 


epvskvsqsrrkagvptanif;KsqaveaamanvpwaevceKfqa 

ALALSRVELHKNPEKEPYKSKYSARALLEEVKALLGPAPEDEDE 

RPEAEDGPGAGDHALGLPAEWEPEGPVAQRAVRLAVIEFHLGV 
NHIDTEELSAGEEHLVK , CT.'RT,T.I?I>VDT.cimr»TaT rTn»^MMrn>r 

LWS EREE I ETAQAYLESS EALYNQ YMKE VGS PPLDPTERFLPEE 
EKLTEQERS KRFEKVYTHNLYYLAQVYQHLEMFEKAAHYCHSTL 
KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 
FGQTGKISATEDTPEAEGEVPELYHQRKGEIARCWIKYCLTLMQ 
NAQLSMQDN IGE LDLDKQS ELRALR KKELDE EES I RK KA VQFGT 
GELC3DAISAVEEKVSYLRPLDFEEARELFLLGQHYVFEAKEFFQ 
I DG YVTDH I E WQDHSALFKGLAF FETDMERRCKMHKRR IAMLE 
PLTVDLNPQYYLLVNRQIQFEIAHAYYDMMBLKVAIADRLRDPD 
SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA 
MLAKFRVARLYGKIITADPKKELENLATSLEHYKFIVDYCEKHP 
EAAQEIEVELELSKEMVSLLPTKMERFRTKMALT 


6483 


3 


623 


NSHLLCGLRARAPLSANGREARAWEQRLABFRAARKRAGLAAQP 
PAASQGAQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 

slvqeaaqpqgstsetpwntaiplpscwdqsfltnitflkvllw 
lvllglfvelefglayfvlslfywmyvgtrgpeekkegeksays 


6484 


201 


965 


"VFNPGCBAIQGTLTAEQLERBLQLRPLAGR 

QLAVKTKMSGLRPGTQVDPKIELFVXAGSOGESIGNCPFCQRLF 
MILWLKGVKFNVTTVDMTRKPEELKDLAPGTNPPFLVYNKELKT 
DFIKIEEFLEQTLAPPRYPHLSPKYKESFDVGCNLFAKFSAYIK 
NTQKEANKNFEKSLLKEPKRLDDYLNTPLLDEIDPDSABEPPVS 
RRLFLDGDQLTIADCSLLPKLNI IKVAAKKYRDFDIPAEFSGVW 
RYLHNAYAREEFTHTCPEDKE I ENT YANVAKQKS 


6485 


6 


1091 


fvdlvraveflpcpdsqklekecqsseeSmgsnsmrsileede'e 

DEEPPRVIiLYHEPRSFEVGMLWJHKHKKYPFWPAWKSVRQRDK 

KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 

NQDIGWCVSLITDYRVRLGCGSFAGSFLBYYAADISYPVRKSIQ 

QDVLGTKLPQLSKGSPEEPWGCPLGORQPCRKMLPDRSRAARD 

RANQKLVBYIGKAKGAESHLRAILKSRKPSRWLQTFLSSSQYVT 

CVETYLEDEGQLDLVVKYIjQG VYQ E VGAKVLQRTNGDR I R FI LD 

VLIiPEAIICAISAGDEVDYKTAEEKYIKGPSLSYREKEIFDNQL 
LEERNRRRR 


6486 


10 


581 


lvlqaggahlspsrvtqgi y YMLAFSEMPKPPDYSELSDSLTLA 

GGTGRFSGPLHRAWRMMNFRQRMGWIGVGLYLLASAAAFYYVFE 

ISETYNRLALEHIQQHPEEPLEGTTWTHSLKAQLLSLPFWVWTV 

IFLVPYLQMFLFLYSCTRADPKTVGYCIIPICLAVICNRHQAFV 
KAS N0I SRLQli I DT 


6487 


352 


863 


SFLKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFLALCA5!T~ 
YYNGCI FHRNI KGFMVQTGDPTGTGRGGNS I WGKKFEDEYSEYL 
KH^mlGWSMANNGPNTNGSQFFITYGKQPHLDMKYTVFGKV^D 
GLETLDELEKLPVNEKT YRPLNDVHI KDITIHAN PFAQ 


6488 


878 


241 


TALQEFGTSGPPLSLRFALPSGTGRFKPLFGARGPSWPPSPRVP 
MEPPNLYPVKLYVYDLSKGLARRLSPIMLGKQLEGIWHTSIWH 
KDE F F FGSGG IS 3 CPPGGTLLGPP DS WDVGS TE VTEE I FL E YL 
SSLGESLFRGEAYNLFEHNCNTFSNEVAQFIiTGRKIPSYITDLP 
SEVLSTPFGQALRPLLDSIQIQPPGGSSVGRPNGQS 


6489 


1457 


375 


KVAKMATALS E E E LDNED Y YS LLNVRREASS E EL KAA YR RLCML 
YHPDKHR DPE L KS QAERL FNLVHQAYE VLSD PQTRAI YD I YG KR 
GLEMEGWEWERRRTPAElREEFERLQREREERRLQQRTNPKGT 
ISVGVDATDLFDRYDEEYEDVSGSSFPQIEINKMHISQSIEAPL 
TATDTAI LSG3LSTQNGNGGGS INFALRRVTSAKGWGELEFGAG 
DLQGPLFGLKLFRNLTPRCFVTTNCALQFSSRGIRPGLTTVIAR 
NLDKNTVGYUJWHCSSPLLQVQRPHRNTRACAPEPSFRPFLHVP 
1WDAECSGARTPSTAWTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCE VWLGYGPRAAAAAAATVLFGGAGPTETMFVARS IAADH 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT 
HSGS VWRVTWAHPE FGQVLAS CSFDRTAAVWEE I VG ESNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYS 
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
UbiPWAMAKVOI FEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQVWRVSWNITGTVXASSGDDGCVRLWKANYMDNWKCTGIL 
KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGRKHS 


6491 


3 


1183 


HEAGCEVWUiYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH" 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT 
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG 
QS HWVKRTTLVDSRTS VTD VK FAPKHMGLMLAT CS ADGI VR I YE 
APDVMNLSQWSLQHErSCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFI 1 1 LA I ATKDVR I FTLKP VR EG2LTSS GG PTKFE I H I VAQ FD 
NHNSQ VWR VS WNI TGTVLAS S GDDGCVRLWKANYMDNW KCTG I L 


6492 


34 


2573 


KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSL.NGSSAGfeKHS 

I P FLKS CCCCCLFDF PP P PLDQVQE E E CEVER VTEtf G T P KP FR K 

FDS VAFGES QS EDEQ FENDLETD P PNWQQLVS RE VLLGL KPCE I 

KRQEVINELFYTBRAHVRTLKVLDQVFYQRVSREGILSPSELRK 

1 FSNLEDI LQ/LH I GLNEQMKAVRKRNETS VI DQ IGEDLLTWFSG 

PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 

PLCRRLQLKDI I PTQMQRLTKYPLLLDNIATYTEWPTEREKVKK 

AADHC^QILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEYPN 

VEELRNLDLTKRKMI HBGPLVWKVNRDKTIDLYTLLLE DI LVLL 

QKQDDRLVLRCHSKILASTADSKHTFSPVIKLSTVLVRQVATDN 

KALFVISMSDNGAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG 

LESTLlSSKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLiK 

EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLLVQQLGLT 

EKSVQEDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSGEGHMP 

FRTGTGDIATCYSPRTSTESFAPRDSVGLAPQDSQASNILVMDH 

MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK 

EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP 

AVB S THQQQHS PQNTHS DGA ISP FT PEFL VQQRWG AME YS C FE I 

QSPSSCADSQSQIMEYIHKIEAOLEHLKKVEESYTILCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGSSTSDCMSKTLDSASAHFAASAWSAPVPSRSEVA 
KEQNTGHNNING WQ PSGTS KTLYS TNMALSSS PG I S AVQL VRT 
VGHTTTNHLI PALCTS S PQTL P MNNS CLTNAVHLNNVS WS P VN 

VHINTRTSAPSPTALKLATVAASMDRVPKVTPSSAISSIARENH 
EPERLGLNGIAETTVAMEVT 


6494 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG 
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKEL 
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL 
ETGAPRPPATVTNAVSWRSEG I KYRKNEVFLDVI BSVNLLVSAN 
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS 
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
DADSPKFKTTVGS VKWVPENSE I VWS I KS FPGGKE YLMRAHFGL 
PS VEAEDXEGKP PIS VKFEI P YFTTSGI Q VR YLK 1 1 EKSGYQAL 
PWVRYITQNGDYQLRTQ 




6495 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIKSASAWVtD 
LKGKVL I CRNYRGD VDM SEVBHFM P I LME KE E EGMLS P I LAHGG 
VRFMWIKHNNLYLVATSfCKNACVSLVPSFLYKWQVFSEYFKEI, 
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL 
ETGAPR P P ATVTNAVS WRSEG I KYR KNE VFLDVI ES VNLLVS AN 

GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS 
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK 
¥ ADiuionofixaiwiRftiujutRKKb 1 AJVNVEIHIPVPN 
DADSPKFKTTVGSVKWVPENSEIVWSIKSFPGGKEYLMRAHFGL 
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL 
PWVRYITQNGDYQLRTQ 




6496 


247 


559 


LRAVSLLPLQLVLPEYSIHSLFCIMFLCAQEWLTLGLNVPLLFY 
HFWRYFHCPADSSELAYDPPWMNADTLSYCQKEAWCKLAPYLL 
SFFYYLYCMIYTLVSS 




6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKMGNGTEEDYNFVFKWLIGBSGV 
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
TAGLERYRAITSAYYRGAVGALLVFDLTKHQTYAWERWLKELY 
DHAEATIWMLVGNKSDLSQAREVPTEEARMFAENNGLLFLETS 
ALD S TNVE LAFE TVLKE I FAKVS KQRQNS I RTNAI TLGSAQAGQ 








B PG PGBKRACC I SL 


6498 


2636 


272 


SLRLCPWGTHIiAGPTTriRLSSLLALIaRPALPLILGLSLGCSLSL 
LRVSWIC^EGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VPyYRDPNKPYKKVLRTRYIQTELGSRERLLVAVLTSRATLSTL 
AVAVNRTVAHHFPRLLYPTGQRGARAPAGMQWSHGDERPAWLM 
S ETLRHLHTHFGADYDWPFI MQDDTYVQAPRLAALAGHLS INQD 

LYIiGRAEEFT(3Af3FfiARYPWV3T?f'iVT.f .c»er ttoi ddui nr>r«or> 

DI LSARPDEWLGRCL I DS LGVGCVSQHQGQQYRS FELAKNRDPB 
KEGSS AFLS AFAVHP VS EGTLMYRLHKRFS ALELERA YS E I EQL 
QAOIRNLTVLTPEGEAGLSWPVGliPAPFTFHSRFEVLGWDYFTE 
QHTFSCADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRF 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL 
LRPLS RVE 1 LPMPYVTEATRVQLVLPLLVAEAAAAPAFLE AFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFIiGVKAAAAELERRY 
PGTRLAWLAVRAEAPSQVRLMDWSKKHPVDTLFFLTTVWTRPG 
PEVLNRCRMNAISGWQAFFPVHFQEFNPALSPQRSPPGPPGAGP 
D P P S P PGAD PS RG AP I GGRFDRQ A S AEGCF YNADYLAARARLAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CSPRLSEELYHRCRLSNLEGLGGRAQIiAMALFEQEQAKST 


6499 


3 


2040 


S CS ADT RPSGQ AW PTVGLRAAAGA FRTG S PLALG PET PQ VACLP 
GHPPVRPQVSGG PGAMPDPAAHLPFF YGS IS RAEAEEHLKLAGM 
ADGLFLLRQCLRS LGG YVLSLVHDVRFHHFP IERQLNGTYAIAG 
uru-urvAjcrtC.ijV»c«r i oKU f lAalitrKM ljKIUr(.NK rou uhPQPGVr DC 
LRDAMVRDYVRQTWKIjEGEAIiEQAI I SQAPQVEKLI ATTAHERM 
PWYHSSLTREEAERKLYSGAQTDGKFLLRPRKEQGTYALSLIYG 
KTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYC 
LK E AC PNSS AS NASGAAAPTL P AH PS TLTHPQRR I DTLNS DGYT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
LXAD I ELGCGN FGS VRQG VYRMRKKQ I DVAIKVLKQGTEKADTE 
EMMREAQIMHQLDNP YI VRLI GVCQAEALMLVMEMAGGG PLHKF 
LVGKREE I PVSNVAE LLHQVS MGMK YLEE KNF VHRDLAARNVLL 
VNRH YAK I S DFGLS KALGADDS YYTARS AG KV^P LKW YAP E C IN F 
RKFSSRSDVWSYGVT>IWEALSYGQKPYKKMKGPEVMAFIEQGKR 
M EC P PE C P PEL Y ALMSDCW I YKWEDRP D FLTV EQRMRAC Y YS LA 
SKVEGPPGSTQKAEAACA 


6500 


1773 


726 


TGPTHASADAWGLVRSVTBV?CANYRGNPCAAALSCPQAVLDAGK 
MLSES S S FLKGVMLGSIFCAL ITMLGHIR IGHGNRMHHHBHHHL 
QAPNKEDI LKISEDERMELSKSFRVYCI I LVKPKDVSLWAAVKE 
TWTKHCDKAEFFSSENVKVFESINMDTNDMWLMMRKAYKYAFDK 
YRDQ YNW ? FLAR PTT FAI I EN LKYFLLKKDP SQP FYLGHT I KSG 
DLEYVGMEGGIVLSVESMKRLNSLLNIPEKCPEQGGMIWKISED 
KQLA VCL K YAG V FAENAEDADGKDVFNTKS VGLS I KEAMTYHPN 
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501 


1 


570 


LVGMSGGGTETPVGCEAAPGGGSKKRDSLGTAGSAHLIIKDLGE 
IHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMIHE 
TNEHTLPKCRDTMRDSLSQVLQR LQAANDSVCRLQQR EQERKK I 
HSDHLVASEKQHMU3WDNFMKEQPNKRAEVDEEHRKAMBRLKEQ 
YAEME KDLAKFSTF 


6502 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 
KSSEALE FMKRDLTE FTQWQHDTACTI AATAS WKEKLATEGS 
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSG? 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ 
EQARRDALKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV 
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 








IHSKPLTPAGHTGGPEPRPFARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQKALSKVDASG 
BVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 
KSS 2 ALEFM KRDLT E PTQWQHDTACT I AATAS WKE KLATEGS 
SGATBKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGE I S ELLVGS PS I RAi YTKMVPAA VSHSEFWHRYF YKVHQLEQ 
EQARRDALKQRAEOS I S EE PGWEEE EEELMG I S PIS P KEAKVPV 
AKISTFFEGEPGPQSPCEKNLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTBEEVQMALSKVDASG 
EVS G PGGSEGSE PNG PGCESS PQPAQLSPQEGP CSCLR 


6504 
6505 


2131 


1294 


GKVC^VAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATC 
SANMTKKKV5QKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ 
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DR VAS SHI SD ANLANT I IGKAVEHMFEGEHGS KDEWRGMVLAQA 
PlMKAWFYITYEKDPVLYMYQLLDDYKEGDIiRIMPESSESPPTE 
REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEARPSVYFIKF 
DDDFHIYVYDLVKKS 




2131 


1294 


GKVCLVAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG 

SANMTKKKVSQKKQRGRPSSOPCRNIVGCRISHGWKEGDEPITQ 

WKGT VLDQVP IMPS LYLVK YDG I DCV YG1»ELH RDERVLS LK I LS 

DRVASSHISDANLANTI IGKAVEHMFEGEHGS KDEWRGMVLAQA 

PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 

REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 

DDDFHIYVYDLVKKS 


6506 


1 


1350 


i£ VS P PTS CCLT VAVAD PGVS EG FRG FG AGCEM PG RGRCP DCGS T 
ELVEDSHYSQSQLVCSDCGCWTEGVLTTTFSDEGNLREVTYSR 
STGENEQVSRSQQRGLRRVRDLCRVLQLPPTFEDTAVAYYQQAY 
RHSG IRAARLQKKEVLVGCCVLITCRQHNWPLTMGA I CTLL YAD 
LDVFSSTYMQIVKLLGLDVPSLCLAELVKTYCSSFKLFQASPSV 
PAKYVEDKE KMLSRTMQLVELANETWLVTGRH PLP VI TAATFLA 
WQSLQPADRLSCSLARFCKIiANVDLPYPASSRLQELLAVLLRMA 
EQLAWLRVLRLDKRSWKHIGDLLQHRQSLVRSAFRDGTABVET 
REKEPPGWGQGOGEGEVGNNSLGLPQGKRPASPALLLPPCMLKS 
PKRICPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAOAAR 
QAATSVPNPP 


6507 


1878 


929 


RS HAS RL P E LPSGCL VLQVQ ELVQMSGMJ5AT VTI P I WQNKPHG A 
ARSWRR IGTNLPLKPCARAS FETLPN I SDLCLRDVP PVPTLAD 
IAWIAADEEETYARVRSDTRPLRHTWKPSPLIVMQRNASVPNLR 
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLTPDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITEETEVE 
v v c \j vz> v FitliCS AS PE CCKPEHKAACS S S E EDDC VS LS KAS S FA 

DMMGILKDFHRMKQSQDLNRSLLKEEDPAVLISEVLRRKFALKE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTPRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQVHVWDACSSRSQVDRLVALARMRQSGAFLSTSEGL 
I LQL VGDAVH PQ FKE I QKL I KE PAPDSGLLGLFQGQNS LLH 


6509 


2 


1053 


FVWNPRGGR KRRRQAAVTQAATRASGTPSPRDGTMT^GKLS VAN "' 
KAPGTEGQQQVHGEKKEAPAVPSAPPSYBEATSGEGMKAGAFPP 
APTAVPLHPS WAYVDPSSSSS YDNG FPTGDHELFTTPS WDDQKV 
RR VFVRKVYT I LL I Q LL VTLAWAL FTFCD P VKDYVQANPG WY W 
ASYAVFFATYLTLACCSGPRRHFPWNLILLTVFTLSMAYLTGML 








WIini i^VijOLijGlTALVCLSVTVFSFQTKFDFTSCQGVLFVL 

lmtlffsglilaillpfqyvpv;lhavtaalgagvftlplaldtq 
llmgnrrhslspeeyi fgalni yldi i yt ftpplqlfgtnre 


6510 


37 


1156 


PCALDGCPgKGAVHPLLSSAMGLLAFLRTQFVLHLLVGFVFWS 
GLVINFVQLCTLALWPVSKQLYRRl^CRLAYSLWSQLVMLLEWW 
SCTECTLFTDQATVBRFGKBHAVIILNHNFEIDFLCGWTMCERF 
GVUSSSKVLAKKELLWPLIGWTWYFLEIVFCKRKWEEDRDTVV 

EG LRRX»S d yp e ymw fll ycegtr ftetkhr vsmevaaa kg l pvl 

KYHLLPRTKGFTTAVKCLRGTVAAVYDVTLNFRGNKNPSLLGIL 
YGKKYEADMCVRRFPLEDIPLDEKEAAQWLHKLYQEKDALQEI Y 
NQKGMFPGEQFKPARRPWTLLNFLSWAT I LLS PLFSFVLGVFAS 
GS PLLILTFLGFVGAGNGHCR 


6511 
6512 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWASFPSPLPGPAPLKGGK 
TMATNFSDIVKQGYVKMKSRKIiGIYRRCWLVFRKSSSKGPQRLB 
KYPD3KSVCLRGCPKVTEISNVKCVTRLPKETKRQAVAIIPTDD 
i SARTFTCDSELEAEEWYKTLSVECLGSRLNDISLGEPDLLAPGV 
QCEQTDRFNVPLLPCPNLDVYGECKLQITHENIYLWDIHNPRVK 
L VS WPLCSLRR YGRDATR FTFRAGRMCDAGEGL YTFQTQ EG EQ I 
YQRVHSATLA1AEQKKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
MLPRSAYWHHITGSQN I AEASS YAG EGYGAAQASSETDLLNRFI 
LLKPKPSQGDSSEAKTPSQ 




159 


807 


FGKKSTWFPLSRSIiRVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EVP SGSGRATGCERGG VRG ARC3GRAPGSS I WR KEPRMVCTR KTK 
TLVSTCVI LSGMTNI I CLL YVGWVTNYI AS VY VRGQEPAPDKKL 
EEDKGDTLKI IERLDHLENVI KQHIQEAPAKPEEAEAEPFTDSS 
L FAHWGQE L S PE GRR VALKQFQY YG YNA YLS DR L PLDR P 


6513 


2 


756 


FVS PE PGFS JiAQLNLI WQLTDTKQLVHSFAEGQDQGSAYANRTA 
LFPDLIiAQGNASIiRLQRVRVADEGS FTCFVS I RDFGSAAVS LQV 
AAPYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQ 
GVPLTGNVT TS QMANEQGL FDVHS I LRW LG ANGT YS CLVRN PV 
LQQDAHSS VTI TPQRS PTGAVEVQVPEDPWALVGTDATLRCS F 
S PE PG FS LAQLNL I WQLTDTKQL VHS FAEGQDQGS AYANR7AL F 
PDLLAQGNASLRLQRVRVADEGS FTCFVS I RDFGSAAVSLQVAA 
PYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVPWQDGQGV 
PLTGNVTTS QMANEQG L FDVHS I LR VVLGANGT YS CLVRN P VLQ 

QDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSFSP 
EPG FS LAQLNL I WQLTDTRQLVHS FTEGR 


6514 


985 


302 


VGIPGPTISSAAEMEOLLDLDEELRYSLATSRAKNGRRAQQESA^ 
QAENHLNGKNSSLTLTGETSSAKLPRCRQGGWAGDSVKASKFRR 
KASEEI EDFRLRPQSLNGSDYGGDI PI IPDLEEVQEEDFVLQVA 
APPS IQ I XRVMTYRDLDNDLMKYSAIQTLD3E I DLKLLTKVLAP 

EHEVRERNPSWQDDVGWDWDKtiFTEVSSEVIiTEWDPLQTEKEDP 
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCGAGSTQLEVSASASCX3ALGSAb>lNPiVV 
VHGGGAGPIS KDRKER VHQGM VRAAT VG YGILREGGS AVDAVEG 
AW ALE DD PEFNAGCGSVLNTNGEVEMDAS IM DGKDLS AG AVS A 
VQCIANPI KLARLVMEKTPHCFLTDQGAAQFAAAMGVPEIPGEK 
LVTERNiGCRLEKEKHE KG AQKTDCQKNLGTVGAVALDC KGNVAY 
AT3TGG I VNKMVGR VG DS PCLGAGGYADND IGAVS T TGHG ES I L 
KVNLARLTLFHIEQGKTVBEAADLSLGYMKSRVKGLGGLIWSK 
TGDWVAKWTSTSMPWAAAKDGKLHFGIDPDDTTITDLP ' 


6516 


1 


1402 


FRRLRYLGQDATAAARDLRTRGLQGYCPSATARQQVLVSALQQL 
KGRRSEHRNENQEMPYSTNKEL I LGI MVGTAGISLLLLWYHKVR 
KPGIAMKLPEFI^I^NTFNSITLQDEIiroDQGTTVIFQERQIiQI 
LEKLNELLTNMEELKEEIRFLKEAIPKLEEYIQDELGGKITVHK 
ISPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS 


6517 






F^VPKAFNTRVEELNLDVLLQKVDHLRMSESGKSESFELLRDHK 
EXFRDE I EFMWRFARAYGDMYE LSTNTQEKKHYAN IGKTLS ERA 
INRAPMKGHCHLWYAVLCGYVSEFEGLQNKINYGHLFKEHLDIA 
IKLI^EEPFLYYLKGRYCYTVSFCLSWIEKKMAATLFGKIPSSTV 
QEALHNFLKAEELCPGYSNPNYMYIAKCYTDLEENQNALKFCNL 
ALLLPTVTKEDKEAQKEMQKIMTSLKR 


6518 


3 


1414 


ORVWGGSSSLNAiyiVYVRGHAKDyERWQRQGARGWDYAHCLPYFR 
KAQGHELGASRYRGADGPLRVSRGKTNHPLHCAFLEATQQAGYP 
LTE DMNGFQQEG FG WMDMT I H EGKR WS AACA YLH PALS RTNL KA 
EAErLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVlLSGGAIN 
S PQLLMLSG IGNADDLKKLG I PWCHLPGVGQNLQDHLEIY I QQ 
ACTRPITLHSAQKPLRKVCIGLEWLWKFTGEGATAHLETGGFIR 
S Q PG VPH PDI Q FH FLPS Q VI DHGR VPTQQEAYQVHVG PMRGTS V 
G W LKLRS AN PQDHP VI QPN YLS TETD I EDFRLCVKLTR E I FAQ E 
ALAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ 
PSDPTAWDPQTRVLGVENLRWDASIMPSMVSGNLNAPTIM1A 
EKAADII KGQPALWDKDVPVY KPRTLATQR 


6519 


242 


1098 


PAWNPG^KPRTRVKPRARSFPLPPPRAPRRRRHRLLRAVPGPSR" 
RHRC^RRAPPPPSTMGDAGSBRSKAPSLPPRCPCGFWGSSKTWf 
LCS KCFADFQKKQPDDDS APSTSNSQSDLFSEETTSDNNNTS IT 
TPTLS PSQQPLPTELNVTS PS KEECG PCTDTAHVSL I TPTKRSC 
GTDS QSENEAS P VKRP RLLENTERS EETS RS KQ KS RR RC FQCQT 

KLELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM 
KM VKLDRK VG RS CQR I GEGCS 


6520 


3 


1113 


SKKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK 
VMNEIKKENGEVKILLKSGKEKPKTNIEDLQIKKVKKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG 
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI 
HIEHQPNGGASVIHCLQ 


6521 


3 


1111 


ERJa^PPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK 
VMNEIKKENGEVKILLKSGKEKPKTNIEDLQIKKVKKKKKKKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG 
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI 
HIEHQPNGGASVIHCLQ 


6522 


184 
1042 


1798 

391 1 


KLFKKATDTSQGELVHPKALPLIVGAQLIHADKLGEKVSDSTMP 
I R RTVNSTRET P P KS KLAEGEE E KPEPDISSEES VST VEEQENE 
TPPATSSEAEQPKGEPENEEKEENKSSEET"<FmRJfnncvT?vpw 

VKKTIPSWATLSASQLARAQKQTPMASSPRPKMDAILTEAIKAC 
FQKSGASWAIRKYIIHKYPSLELERRGYLLKQALKRELNRGVI 
KQVKGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEPKEASYSLIRKYVSQYYPKLRVDIRPQLLKNA 
LQRAVERGQLEQITGKGASGTFQLKKSGEKPLLGGSLMEYAILS 
AIAAMNBPKTCSTTALKKYVLENHPGTNSNYQMHLLKKTLQKCB 
KNGWMEQI SGKGFSGTFQLCFP Y YPS PGVL FP KKEPDDSRDEDE 
DEDESSEEDSEDEEPPPKRRLQKKTPAKSPGKAASVKQRGSKPA 

PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS 
KKPATSARKE 

WWLRPSPRSHRTPBSGRVLSLPRLPPPGMALSGSTPAPCWEED 








ECLDYYGMLSLHRMFEWGGQLTECELELLAFLLDEAPGAAGGL 
^K>*^VJijftJjljUMjkKKGQCDESNLR^ 

R KRRRP VS PERYS YGTS S S S KRTEGS CRRRRQSS SS ANSQQG S P 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIALAMASNFNDIVKQGYVKI 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNPHKVT 
ELHNI KNI TRLPRETKKHAVAI I FHDETS KTFACESELEAEEWC 
KHLCMECLGTRLND I S IjGEPDLLAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQ I THEN I YLWD IHNAK VKL VMWP LS S LRR YGRDST 
WFTFESGRWCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524 
6525 


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIAIiAMASNFNDIVKQGYVKI 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNI KNI TRLPRETKKHAVAI I FHDETS KTFACESELEAEEWC 
KHLCMECLGTRLND I S LGE PDLLAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQI TH EN I YLWD I HN AKVKLVMWPLS SLRR YGRDS T 
WFTFESGRMCDTGEGL FT FQTREGEM I YQ KVHS ATLAI AEQHER 
LMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE 

lYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 




1 


1859 


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS 

PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 

KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 

NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 

SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 

EQYLTPLQQKEVTVRHLKTKLKESERRLHERESEIVELKSQLAR 

MR E D W I EEE CHRVEAQLALKEAR KE I KQLKQ V I ETMRSS LADKD 

KGIQKYFVDINIQNKKLESLLQSMEMAHSGSLRDELCLDFPCDS 

PEKS LTLNP PLDTMADGLS LEEQVTGEGADR ELLVGDS I ANSTD 

LFDEIVTATTTESGDLELVHSTPGANVLELLPIVMGQEEGSVW 

ERAVQTDWPYSPAISELIQSVLQKLQDPCPSSLASPDESEPDS 

MESFPESLSALWDLTPRNPNSAILLS PVETPYANVDAEVHANR 

LMRELDFAACVEERLDGVIPLARGGWRQYWSSSFLVDLLAVAA 

P WPT VLWAFS TQRGGTD P VYNI GALLRGCC WALHSLRRTAFR 

IKT 


6526 


2 


2034 


SGRAGEPEEWRGRQIIDSKETWIPFNSEDSQQLEEAYSSGKGCN 
GRWPTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVP YSES FS QVLEETYMLAVTLDEWKKKLES PNREII ILHNP 
KLMVHYQPVAGSDDWGSTPMEQGRPRT VKRGVEN IS VDI HCGEP 
LQ I DHLVFWHG I G PACDLR FRS I VQCVNDFRS VS LNLLQTH FK 
KAQENQQ I GRV E FL PVNWH S P LHS TG VDVDLQR ITLPS INRLRH 
FTNDTILDVPFYNSPTYCQTIVDTVASEMraiYTLFLQRNPDFK 
GGVSIAGHSLGSLILFDILTNQKDSLGDIDSEKGSLNIVMDQGD 
TPTLEEDLKKLQLSEFFDI FEKEKVDKEALALCTDRDLQE IGI P 
LGPR KKI LN Y FS TR KNS MG I KRP APQ PASG AN I P KES E FCS S SN 
TRNGDYLDVGIGQVSVKYPRLIYKPE I FFAFGS PIGMFLTVRGL 
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI 
PHHKGRKRMHLELREGLTRMSMDLKNNLLGSLRMAWKSFTRAPY 
PALQASETPEETEAEPESTSEKPSDVNTEETSVAVKEEVLP I NV 
GMLNGGQR IDYVLQEKPIES FNE YLFALQSHLCY WE SEDTVLLV 
LKEIYQTQGIFLDQPLQ 


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMKCQRGDLS 
FIFNGDAAPSESFWLDNEQKVYQRIHHEESEMETEEEVDILMS 
SDIYSATLSTKSISFTRAQTGWLFREDKTERVGNFLADFYLVNG 








LVLESRKRREHLSEEBILRNKAIMESLSKGGNIMEQNFEPIRRQ 
SLT P P PQNT I TW E E Y I S AENG KAPH LGRELVC KES KKTFKATI A 
MSQEFPLGIEliLLNVLBWAPFKHFNKLREFVQMKLPPGFPVKL 
DIPVFPTITATVTFQEFRYDSFDGSIFTIPDDYKEDPSRFPDL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKILFCVLGLYIA 
I P PL I KLCPG I QAKLI FLNFVRVP Y FI DLKKPQDQGLNHTCN Y Y 
LQPEEDVTIGWHTVPAVWWKNAQGKDQMWYEDALASSHPIILY 
LHGNAGTRGGDHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE 
RGMTYDALHVFDW I KARSGDN PVYI WGHSLGTGVATNIjVRRLCE 
RETPPDALILESPFTNIREEAKSHPFSVIYFYFPGFDWFFLDPI 
TSSGIKFANDENVKHISCPLLILHAEDDPWPFQLGRKLYSIAA 
PARS FRDFKVQ FV P FHS DLG YRHK Y I YKS PEL P R I LRE FLG KS E 
PEHQH 


6529 


363 


2215 


THIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDFDWLF 

EG VEDES FLKW F CGNVNEQNVLS ER E LEAFS I LQKSGKPI L EG A 

ALDEALKTCKTSDLKTPRLDDKELEKLEDEVQTLLKLKNLKIQR 

RNKCQLMASVTSHKSLRLNAKEEEATKKLKQSQGILNAMITKIS 

NEWALTDEVTQLMMFFRHSNLGQGTNPLVFLSQFSLEKYLSQE 

EQSTAALTLYTKKQFFOGIHEWESSNESQFFNFLXIQTPSICD 

NQEILEERRLEMARLQLAYICAQHQLIHLKASNSSMKSSIKWAE 

ESLHSLTSKAVDKENLDAKISSLTSEIMKLEKEVTQIKDRSLPA 

WR ENAQ LLKM P WKGDFDLQ IAKQD YYTARQE L VLNQL I KQKA 

SFELLQLSYEIELRKHRDIYRQLENLVQELSQSNMMLYKQLEML 

TDPSVSQQlNPRNTIDTKDYSrHRLYQVLEGENKKKELFLTHGN 

LEEVAEKLKQNISLVQDQLAVSAQEHSFFLSKRNKDVDMLCDTL 

YQGGNQLLLSDQELTEQFHKVESQLNKLNHLLTDILADVKTKRK 

TLANNKLHQMERE FYVYFLKDEDYLKD I VENLETQS KI KAVS LS 

D 


6530 


128 


2966 


GAAHHGAIVQVHPLLPGSSTIMIHDLCLVFPAPAKAWYVSDIQ 
ELYIRWDKVEIGKTVKAYVRVLDLHKKPFLAKYFPFMDLKLRA 
ASPIITLVALDEALDNYTITFLIRGVAIGQTSLTASVTNKAGQR 
INSAPQQIEVFPPFRLMPRKVTLLIGATMQVTSEGGPQPQSNIL 
FS I SNES VALVS AAGLVQGLAIGNGT VSGLVQAVDAETGKWI I 
SQDLVQ VEVLLLRAVR IRAP I MRMRTGTQMP I YVTG I TNHQN P F 
SFGNAVPGLTFHWSVTKRUVLDLRGRHHEASIRLPSQYNFAMNV 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFEKLQ 
LLNPEIEAEQILMSPNSYIKLQTNRDGAASLSYRVLDGPEKVPV 
VHVDE KGFLASGSMIGTSTI EVI AQEP FGANQTI I VAVKVS PVS 
YLRVSMS PVLHTQNKEALVAVPIjGMTVTF'rVHFHDNSGDVFHAH 
SSVLNFATNRDDFVQIQKGPTNNTCWRTVSVGLTLLRVWDAKH 
PGLSDFMPLPVLQAISPELSGAMVVGDVLCLATVLTSLEGLSGT 
WSSSANSILHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKEVW 
SVPQRIMARHLHPIQTSFQEATASKVIVAVGDRSSNLRGECTPT 
QREVI Q ALH PET LI S CQSQFKPAVFD F P S QD VFT VE PQ FDTALG 

^VPPC T T*MI_ITOT Tnvnn I/tit k* wm» r * vm » n * *i n _ 

\j x v Lb I TMHRLTDKQR KHLS MKKTALWS AS LSS SHFS T EQ VGA 
EVPFS PGLFADQAE I LLSNH YTS SE IRVFGAPE VLENLE VKSGS 
PAVLAFAKEKSFGWPSFITYTVGVLDPAAGSQGPLSTTLTFSSP 
VTNQAIAI PVTVAFWDRRGPGPYGASLFQHFLDSYQVMFFTLF 
ALLAGTAVMIIAYHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
S PTS PNALP PAR KAS P PSGL WS PA YAS H 


6531 




1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS - 
SLCMVITI YYDVKVRFJ VRGCGQY I S YRCQEXRNT YFAE YWYOA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
NLSLPLPNFHAGTEPDGLDPMVTLSLNLGLSFAELRRMYLFLNS 
SGLLVLPQAGLLTPHPS 


6532 


2 


954 


AAGPPSEVVNQDSLFPEPEPGPAPQVLLGPQGPGLIKGVAPPTL 








ITDSTGTHLVLTVTNKNAHSPGLSRGSPQQPSSQPGSPAPAPSA 
QMDLEHPLQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 
QMDDLFDI LIQSGS I S ADFKEPPSLPGKEKPS PKTVCWSPLAAQ 
PSPSAELPQAAPPPPGSPSLPGRLEDFLESSTGLPLLTSGHDGP 
EPLSLIDDLHSQMLSSTAILDHPPSPMDTSELHFVPEPSSTMGL 

DLADGHLDSMDWLELSSGGPVLSLAPLSTTAPSLFSTDFLDGHD 
LQLHWDSCL 


6533 


1798 


373 


STISWLARVEPPRRSSGVGAARLRFPGGSRPLRARACVLALiAVL " 

ALLERNNADSMS AHS MLCERIAIAKE I* I KRAES LS RSRKGG I EG 

GAKLCSKLKAELKFLQKVEAGKVA1KESHLQSTNLTHLRAIVES 

AENLEEWSVLHVFGyTDTLGEKQTLWDWANGGHTWVKAIGR 

KAb/UjHN l WX/3RGQ YGDKS I IEQAEDFLQASHQQPVQYSNPHI I 

FAFYNSVSSPMAEKLKEMG1SVRGDIVAVNALLDHPEELQPSES 

ESDDEGPELLQVTRVDRENILASVAFPTEIKVDVCKRVNLDITT 

LITYVSALSYGGCHF1FKEKVLTEQAEQERKEQVLPQLEAFMKD 

KELFACESAVXDFQSILDTLGGPGBRERATVLI KR INWPDQPS 

ERALRLVASSKINSRSLTIFGTGDTLKAITMTANSGFVRAANNQ 

GVKFS V F I HQ P R ALTES KE ALAT PL P KD YTTD S E H 


6534 


47 


596 


KATR F I S AAFWLNKQG VS PAKLPHTS WS WSLQTLS FLFSG DLA 
EKSLQCFPCSAMLLELIPLLGIHFVLRTARAQSVTQPDIHITVS 
EGASLELRCNYSYGATPYLFWMERTVEEAFILLVCLKPWRVASS 
LEKKEKEDESFQLLLGSRYNVLKAHCLLPLIRWLTSGDSLLSAQ 
PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKEKNLETLLTiiAFLEIDKAFSSHARLS " 
ADATLLTSGTTATVALLRDGIELWASVGDSRAILCRKGKPMKL 
1 1UM1 ^bKJUJEKERIKKCGGFVAWNSLGQPHVNGRLAMTRSlGD 
LDLKTSGVIAEPETJCRIKLHHADDSFLVLTTDGINFMVNSQEIW 
D FVNQCH DPNEAAHAVTEQAI Q YGTEDNSTA WVPFGAWGK YKN 
SEINFSFSRSFASSGRWA 


6536 


242 


1174 


S LVKE MTNQ YG I L F KQEQAHDDA I WS VAWGTNKKENS ET VVTG S " 
LDDL VKVW KWR DERLDLQWS LEGHQLG VVS VDISHTLP I AAS S S 
uunti j. k jj w ulicjn L> ivy x Ko I DAGPVDAWTLAFSPDS Q Y LATGTHV 
GKVNIFGVESGKKEYSLDTRGKFILSIAYSPDGKYLASGAIDGI 
INI FD I ATG KLLHTL EGHAMP I RS LTFS PDSQLLVT AS DDG Y I K 
IYDVQHANLAGTLSGHASWVLNVAFCPDDTHFVSSSSDKSVKVW 
DVGTRTCVHTFFDHQDQVWGVKYNGNGSKIVSVGDDQEIHIYDC 


6537 


1638 


921 


NRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLLDRV 

FTTYKLMHTHQT VD F VRSKHAQ FGG FS YKKMTVMEAVDLLDGL V 
DESDPDVDFPNSFHAFQTAEGIRKAHPDKDWFHLVGLLHDLGKV 
LALFGEPQWAWGDTFPVGCRPQASWFCDSTFQDNPDLQDPRY 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTLS PQSTCTR 


6538 
6539 


3345 
218 


2412 
339 


P YLYDFLDAL I TCQTAPE EAF I KLDGLAGM LTEQLRRLTKQ VQ E 
ARHNRDDEAI KKAVNE YDETMEK YX P VLMAQAKI YWNLENYPM V 
EKIFRKSVEFCNDHDVWKLNVAHVLFMQENKYKEAIGFYEPIVK 
KHYDNILNVSAIVLANLCVSYIMTSQNEKAEELMRKIEKBEEQL 
SYDDPNRKMYHLCIVNLVIGTLYCAKGNYBFGISRVIKSIjEPYN 
KKLGTDTWYYAKRCFLSLLENMSKHMIVIHDSVIQECVQFLGHC 
BLYGTNIPAVIEQPIjEEERMHVGICNTVTDBSRQLKALIYEIIGW 
NK j 


6540 


3 


391 


FI^AASPHPHFSSLAPHPDOPEFTPVQDELEAMELWGPGV 
LERLMLLLLRRPEDAMAECPTU5EAVTDHPDRLWAWEKFVYLDE 
KQHAWLPLTIEIKDRLQLRVLLRREDWLGRPMTPTQIGPSLLP 
IMWQLYPDGRYRSSDSSFWRLVYHIKIDGVEDMliLELLPDD 


6541 


1165 


536 


RTijVQRRIbMULRKPARGRDLRGRGRGTPRGGRKGLLPl'PDBFP 
R FEGGRKP DS WDGNRE PGPGHEH FRDT P R PDHP PHDGHS P AS RE 
RSSSLQGMDMASLPPRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
S RGGRSGS NWGRGSNMNSG P P RRGASRGGGRG R 


6542 


3 


3775 


SWPRGRGETGGHPGALRTRTMQKSTOYNEGHALYIiAFLARKEGT 
KRGFLSKKTAEASRWHEKWFALYQNVLFYFEGEQSCRPAGMYLL 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADILIEREV1.MQKYIHLVQIVET 
EKIAANQLRHQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPDIKKIKKVQSFMRGWLCRRKWKTIVQDYICSPHAESMRKR 
NQIVFTMVEAESEYVHQLYILVNGFLRPIjRMAASSKKPPISHDD 
VS S I FLNS ETIMFLHEI FHQGLKAR I ANW PT L I LADL FD I LLPM 
LNIYQEFVRNHQYStfQVIANCKQNRDFDKIjLKQYEANPACEGRM 
LETFLTYPM FQI PRY 1 1TLHELLAHTPHEHVBRK5LEFAKSKLE 
ELSRVMHDEVSDTENIRKNLAIERMIVEGCDILLDTSQTFIRQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGGKLHLLKTGGVLSLI DCTLIEEPDASDDDS KGSGQVFGHLDF 
KIWEPPDRAAFTVVLLAPSRQEKAAWMSDISQCVDNIRCNGLM 
TIVFEENSKVTVPHMIKSDARLHKDDTDICFSKTLNSCKVPQIR 
YASVERLLERLTDLRFLSIDFLNTFLHTYRIFTTAAWLGKLSD 
I YKR PFTS I PVRSLELFFATSQNNRGEHLVDG KSPRLCRKFSS P 
PPLAVSRTS S P VRARKL5LTS PLNS KIGALDLTTS SS PTTTTQS 
PAAS P PPHTGQ I PLDLSRGLSSPEQSPGTVEENVDNPRVDLCNK 
LKRS IQKAVLESAPADRAGVESS PAADTTELS PCRSPST PRHLR 
YRQPGGOTADNAHCSVS PASAFAI ATAAAGHGS PPGFNNTERTC 
DKEFI IRRTATWRVLNVLRHWVSKHAQDFELNNELKMNVLNLLE 
EVLRDPDLL PQERKAAANI LMALS QDDQDD I H LKLED 1 1 QMTDC 
MKAECFESLSAMEIAEQITLLDHVIFRSIPYBEFIjGQGWMKIiDK 
NERTPYI MKTSQH FNDM SNLVASQ I MNYADVSS RANA I E KWVAV 

adicrclhnyngvleitsalnrsaiyrlkktwakvskotkalmd 

KLQKTVSSEGRFKNLRETLKNCMPPAVPYLiGMYLTDLAFIEEGT 

PNFTEEGLVNFSKMRMISHIIREIRQFQQTSYRIDHQPKVAQYL 

LDKOLI r DEDTLYELSLKIEPRLPA 


6543 


1857 


950 


lkvhlqtqqevklrmtgmalrwrtdg i lalysgls as lcrqmt 
ysltrfaiyetvrdrvakgsqgplpfhekvllgsvsglaggfvg 

SGATMASSRGALVTVGQLS CYDQAKQLVLSTGYLSDNI FTHFVA 
SFIAGG CAT FLCQ P LDVLKTRLMNS KG E YQG VFHCAVETAKLG P 
LAF YKGLVPAGI RLI PHTVLTFVFLEQLRKNFGI KVPS 


6544 


630 


79 


PSPCFIRSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRASXS 
E PSDTD PE PRTLNPS P AGW FVQQHPE LE IMS S FRE R FGRNWLQ Y 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
GPQES PQ KMS E E VRAE PQ E EE EEKEG KEEKEEGEMAPLP EAHLG 
EGKQKECP ' 


6545 


176 


560 


PPHSHAALLPAAMTPLLTLILWLMGLPLAQALDCHVCAYNGDN 
CFNPMR C PAMVAYCMTTRT YYTPTRMKVS KS CVP RCFET VYDG Y 
SICHASTTSCCQYDLCNGTGLATPATLALAPILLATLMGIiL 


6546 


1657 


364 


HLLNGLDEVAAFFVADUSAIVRKHFCFLKCbPRVRPFYAVKCNS 
SPGVLKVIAQLGLGFSCANKAEMELVQHIGIPASKIICANPCKQ 
I AQ I KYAAKHG IQLLS FDNEM E LAKWKSH PS AKMVLC IATDDS 
HS LSCLSLKFGVSLKSCRHLLENAKKHHVE WGVS FHIGSGCPD 
PQAYAQSIADARLVFEMGTELGHKMHVLDLGGGFPGTEGAKVRF 
EEIASVINSALDliYFPEGCGVDIFAELGRYYVTSAFTVAVSIIA 
KKEVLLDQPGREEENGSTSKTIVYHLDEGVYGIFNSVLFDNICP 








TPILQKKPSTEOPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDW 
LVFDNMG A YTVGMGS PFWGTQACH ITYAMS RVAWEALRRQLMAA 
EQEDD VBG VC KPLS CG WEI TDTLCVGPVF7PAS IM 


6547 


1 


541 


IjHSKYIaAPALCSQPGMMRCCRRRCCCRQPPHAJjRPLLLLPLVLL 
PPLAAAAAGPNRCDTIYQGPAECLIRLGDSMGRGGELETICRSW 
NDFHACASQVLSGCPEBAAAW/ESLQQEARQAPRPNNLHTLCGA 
P VHVR ERGTGS ETNQETLRATAPAL PMAPAP P LLAAALALAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVPIETDAEDG 
I KG CGI TFTLGKG TEVG P J .K T T, c d row a 


6549 


73 


1490 


ETGRVCEDARPACGSRSRRRRKEAAPGIPTPSPSSSSPTSSRPA 

ARAFSKAPARLSRPRAREEPPDPGRRYIQEEIIQARKHKLIKMC 

SSVAAKLWFLTDRRIREDYPQKEILRALKAKCCEEELDFRAWM 

DEWLTIEQGNLGLRINGELITAYPQWWRVPTPWVQSDSDIT 

VLRHLEKMGCRLMNRPQAILNCVNKFWTFQELAGHGVPLPDTFS 

YGGHEN FAKM IDEAE VLE FPMVVKNTRGHRGKAV FLAR DKHHLA 

DLSHLIRHEAPYLFQKYVKESKGRDVRVIWGGRWGTMLRCST 

DGRMQSNCSLGGVGWJCSLSEQGKQIiAIQVSNIUGMDVCGIDLL 

MKDDGSFCVCEANANVGFIAFDKACNLDVAGI I ADYAASLLPSG 

R LTRRMSLbS WS TAS ETS EPE LG P PASTAVDNMS ASS SS VDS D 

PESTERELLTKXiPGGT.F"NMNfrYr.T &XIPrirr t \m 


6550 


2293 


922 


FRVSRDGAPDCGlEUMGbAMEHGGSYARAGGSSRGCWYYLRYFF~" 

LFVSLIQFLIILGLVLFMVYGNVHVSTESNLQATERRAEGLYSQ 

LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDRINASFR 

QCQGDRVIYTNNQRYMAAIILSEKQCRDQFKDMNKSCDALLFML 

nvn.Tn.iuDV£iiAwiM jlv.1 KUiUSo VLIiNKRVAEEQIjVECVKTRE 

LQHQERQIAKEQLQKVQALCLPLDKDKFEMDLRNLWRDS I I PRS 

LDNLGYNLYHPLGSELASIRRACDHMPSLMSSKVEELARSLRAD 

IERVARENSDLQRQKLEAQQGLRASQEAKQKVEKEAQAREAKLQ 

AECSRQTQIiALEEKAVr.RKERDNLAXELEEKKREAEQLRMELAI 

RNSALDTCIKTKSOPMMPVSKPMnDVDMDnDTn-DAeT sroirnvT 

LESQRPPAGI PVAPSSG 


6551 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER 
EVIQKAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6552 


157 



748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER 
EVIQKAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6553 


2 


1807 


FWSKMAAHLSYGRVNLNVLREAVRRELREFLDkCAGSKAIVWD 
E YLTG P FG L IAQ YS LLKEHEVE KMFTL KGNRL PAAD VKN I 1 FF V 
RPRLBLMDIIAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
LGVLGSFIHREEYSLDLIPFDGDLLSME3EGAFKECYLEGDQTS 
LYHAAKGLMTLQALYGTIPQIFGKGECARQVANMMIRMKREFTG 
SQNSIFPVFDNLLIiLDRNVDLLTPLATQLTYEGLIDEIYGlQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEELYAEIRDKN 
FNAVGS VLS KKAK 1 1 S AA FE ERHNAKTVGE I KQFVS QL PHMQAA 
RGSLANHTS I AEL I KDVTTS E D F FDKLTVEQE FMSG I DTDKVNN 
YIEDCIAQKHSLIKVLRLVCLQSVCNSGLKQKVLDYYICREILQT 
YGYmiLTLHNLEKAGLLKPQTGGRNNYPTIRKTLRLWMDDVNE 
QNPTD I S YVYSGYAPLS VRLAQLIiSRPGWRS I EEVLR I LPGPHF 
EERQPLPTGLQKKRQPGENRVTLIFFLGGVTFAEIAAliRFLSQL 
EDGGTEYVIATTKLMNGTSWIEALMEKPF 


6554 


119 


1244 


FBMGSQVS VSSGALHW I VGGGFGGIAAASQLQALNVP FMLVDM 
KDSFHHNVAALRASVETGFAKKTFISYSVTFKDNFRQGLVVGID 
LKNQMVLLQGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDM VRQVQRSRF I VWGGGSAGVEMAAE I KTEYPBKEVTLIH 

SQVALADKELLPSVRQEVKEILLRKGVQLLLSERVSNLEELPLN 

EYREYIKVOTDKGTFVATMT.vtt ctt TVTMpr * "Ln-.Tr* r-.^-, ^.^^ 
» *. * j. v v * v ma i r. v/ii n jj v xlmk. rGI KINSS AYRKAFESRLAS 

S G ALR VNE 1 1 LQ VEGHS NVY A I G DCAD VRT P KMAY LAG LHAN I AV 

AN I VNS VKQR PLQA YKPGALTFLLS MG RNDG VGQ I SG FYVGRLM 

VRLTKSRDLFVSTSWKTMRQSPP 


6555 


1552 


498 


IHMALLRK1K0VLLFLLIVTLCVILYKKVHKGTVPXNDA5DESE - 
TPEELEEEIPWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNT LTR I RKW I EHS KLREINFKIVE FNPKG LKG KI RPDS S R PE 
LLQPLNFVRFYLPlilHQHEKVIYLDDDVIVQGDIQELYDTTLA 
LGHAAAFSDDCDLPSAQDINRLVGLQNTYMGYLDYRKKAI KDLG 
ISPSTCSFNPGVIVANMTEWKHQRI7KQLEKWMQKNVEENLYSS 
£>l»t»Cit»VATSPMLIVFHGKYSTINPLWHIRHLGWNPDARYSEHFL 
QEAKLLHWNGRHKPWDFPS VHNDLWESWFVPDPAGI FKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVIltPSLQSPPTFGFLLDIDGVLVRGHRVI 
PAALKAFRRLVNSQGQLRVPWFVTNAGNILQHSKAQELSALU3 
CEVDADQVIL2HSPMKLFSEYI1EKRMLVSGQGPVMENAQGLGFR 
NWTVDELRMAFPLLDMVDLERRLKTTPLPRNDFPRIEGVLLLG 
EPVRWETSLQIjIMDVLLSNGSPGAGIjATPPYPHLPVIiASNMDLL 
WMAEA KMPR FGHGTFLLCLET I YQKVTG KELR Y EGLMG K PS I LT 
YOYAE DLIRRQAERRGWAAP I R KLYAVGDNPMSDVYGANLFHQY 
LQKATHDGAPELGAGGTRQQQPSASQS CI S I LVCTGVYNPRNPQ 
.STEPVLGGGEP P FHGHRDLCFS PGLMEASHWNDVNEAVQLVFR 
KEGWALE 


6557 


2598 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN 

KSPQSN9PVLLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP 

SKLQFNTTTfCR5DTVMEKRSFKVPLGKGRRCVVIiADGFYEWQRC 

QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLLT 

MAGIFDCWEPPEGGDVLYSYTIITVDSCKGLSDIHHRMPAILDG 

EEAVSKWLDFGEVSTQEALKLIHPTENITFHAVSSWNNSRNNT 

* v v vivi\jiijK/toVjboyKMJ^wiiA.TKSPKKBDSKTPQKE 

ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP 
YSQ 


6558 


21 


1138 


jijajvjo v_ijouvjxvii/\/\n.GCj^ a^t, v KKliRX^ FAS 
VASCDAAVAQCFLAENDWEMERALNSYFEPPVEESALERRPETI 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENGSMFSLITWNID 
GLDLNNLSERARGVCS YLALYS PDVI FLQEVI PPYYS YLKKRSS 
NYEI1 TGHE EG YFTA I MLKKS R VKLKS QE 1 1 PFP STKMMRNLLC 
VHVNVSGNELCLMTSHLESTRGHAAERMNQLKMVLKKMQEAPES 
ATVIFAGDTNLRDREVTRCGGLPNNIVDVWEFLGKPKHCQYTWD 
TQMNSNLG I TAAC KLRFDR I FFRAAAE EGH 1 1 PRSLDLLGLE KL 
DCGRFPS DHWGLLCNLDI IL 


6559 


3 


364 


GPELSGLPTRPKKLKANQTPiAMDCCASRSCSVPTGPATTICSS 
DKSCRCGVCLPSTCPHTVWLI.EPTCCDNCPPPCHIPQPCVPTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCLPRGC 


6560 


3 


1435 


TATSGGIWLKKKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
GVDHTKMS LHGASGGHERSRDRRRS S DRSRDSSHERTE SQLTPC 
IRNVTSPTRQHHVEREKDHSSSRPSSPRPQKASPNGSISSAGNS 
SRNSSQSS SDGSCKTAGEMVF VYENAKEGARNIRTS ERVTLI VD 
NTRFWDPS I FTAQPNTMLGRMFGSGREHNFTRPNEKGE YEVAE 
GIGSTVFRAI LDYYKTG I IRCPDG I S I PELREACDYLCI SFEYS 
TI KCRDLS ALMHELS NDGARRQFE F YLEEM I LPLMVAS AQS GER 
BCHIWLTDDDWDWDEEYPPQMGEEYSQ1IYSTKLYRFFKYIE 








NRDVAKSVLKBRGLKKIRLGIEGVPTYKEKVKKRPGGRPBVIYN 
WQRPFIRHSWEKEEGKSRHVDFQCVKSKSITNLAAAAADIPQD 
QLWMH PTPQVDELDI LP IHPPSGNSDLDPDAQN PML 


6561 


3 


1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLSPEPSPSWPSHS 

PCPMAALTDLSFMYRWFKNCNLVGNLSEKYVFITGCDSGFGNLL 

AKQLVDRGMQVLAACFTEEGSQKLQRDTSVTILQTTIjLDVTKSES 

IKAAAQWVRDKVGEQGLWALVNNAGVGLPSGPNEWLTKDDFVKV 

INVNLVGLI3VTLHMLPMVKRARGRWNMSSSGGRVAVIGGGYC 

VSKFGVEAFSDSIRRELYYFGVXVCIIEPGNYRTAILGKENLES 

RtlRKLWERLPQETRDSYGEDYFRIYTDKIiKNIMQVARPRVRDVI 

NSMEHAIVSRSPRIRYNPGLDAKLLYIPLAKLPTPVTDFILSRY 
LPRPADSV 


6562 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
LYEWFLGKRSEGVPVSG PMLI EKAKDFYEQMQLTBPCVFSGGWL 
WRFKARHGIKKLDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGlQHIiPVAYKAQGNAWVDKEIFS 
DW FHH I F VP S VREHFRT IGLP EDS KAVLL LDS SRAHPQE AE LVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMNDAIFSVACAWNAVPSHVFRRAWRXLWPSVAFAEGSSSEEE 
LEAECFP VKPHNKS FAHI LELVKEGSSCPGQLRQRQAAS WGVAG 
REAEGGRPPAATSPAEWWSSEKTPKADQDGRGDPGEGEEVAWB 
QAAVAFDAVLRFAERQPCFSAQEVGQLRALRAVFRSQQQVRRRR 
GALGAWKVEALQEGPGGCGATAQS PLP CSS TAGDN 


6563 


1319 


2694 


LARPAQPVLLREPEGAGPPVPAGHLVHHLQGGHLRERAHPDLEA 

HEHPLPCDQMFWRQMGGHI/RMVEANSRGWWGIGYDHTAWVYTG 

GYGGGCFCX5IASSTSNIYTQSDVKCVHIYENQRWNPVTGYTSRG 

LPTDRYf4WSDASGU3ECTKAGTKPPSLQWAV7VSDWFVDFSVPGG 

TDQEGWQYASDFPASYHGSKTMKDFVRRRCWARKCKLVTSGPWL 

EVPP I ALRDVS 1 1 PESPGAEGSGHS I ALWAVSDKGDVLCRLGVS 

ELNPAGSSWLKVGTDQPFAS I S IGACYQVWAVAREX5SAF YRGSV 

YPSQPAGDCWYHIPSPPRQRLKQVSAGQTSVYALDENGNLWYRQ 

GITPSYPQGSSWEHVSNNVCRVSVGPLDQVWVIANKVQGSHSLS 

RGTVCHRTGVQPHEPKGHGWDYGIGGGWDKISVRANATRAPRSS 

SQEQEPSAPPEAHGPVCC 


6564 


1 


975 


APGS CALWS Y CGRG WSRAM RG CQLLGLRSS W PGDLLS ARLLSQE 
KRAAETHFGFETVSEEEKGGKVYQVFESVAKKYDVMNDMMSLGI 
HRVWICDLUiWKMHPLPGTQLLDVAGGTGDIAFRFLNYVOSQHQR 
KQKRQLRAQQNLSWEEIAKEYQNEEDSLGGSRWVCDINKEMLK 
VGKQKALAQGYRAGLAWVLGDAEELPFDDDKFDI YTIAFG I RNV 
TH I DQALQEAHRVLKPGGRFLCLE FSQVNNPL I S RLYDLYS FQ V 

1PVLGEVIAGDWKSYQYLVESIRRFPSQEEFKDMIEDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565 


1464 


999 


RSAVANGLTKRRMGLKI^GRYISLILAVQIAYLVQAVRAAGKCD~ 
AVFKGFSDCLLKLGDSMANYPQGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCQBGAKDMWDKLRKES KNLNIQGSLFELCGSGNGAAGS 
LLP A FP VLLVSLSAALAT WL S F 


6566 


3 


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFLFPGAWAQG 
HVPPGCSQGLNPLYYNLCDRSGAWG I VLEAVAGAGI VTTFVLTI 
I LVAS L P F VQDT KKRS LLGTQV FFLLGTLGLFCLVFACVE KP DF 
STCASRRFLFGVLFAICFSCLAAHVFALNFLARKNHGPRGVn/IF 
TVALLLTIiVEVI INTE WL 1 1 TLVRGSGEGGPQGNS S AGWAVAS P 
CAIANMDFVMAL I YVMLLLLGAFLGAWPALCGR YKR WRKHG VFV 
LLTTATS VAI WWWI VM YTYGNKQHNS PTWDDPTLAI ALAANAW 
AFVLFYVI PEVSQVTKS SPBQS YQGDMYPTRGVG YETI LKEQKG 
QS M FVENKAFSMD E P VAAKRP VS P YS G YNGQLLTS VYQ PTBMAL 








MHKVPSEGAYDI I LPRATANSQVMGS ANSTLRAEDMYSAQSHQA 
ATPPKDGKNSQVFRNPYWD 


6567 


125 


863 


TKRSNLKAYACS I HH IRTMS YVF VN DSSQTNVP LLQACI DGDFN 
YS KR LLESG FDPM I R DS RGRTGLHLAAARGNVD I CQLLH K FGAD 
LLATDYOGNTAL'-TT^nHViyrTrtFT VTmnvrun^ATm \tr 

AKRRGVNKDVIRLLESLEEQEVKGFNRGTHSKLETMQTAESESA 
MBSHSLLNPNLQQGEGVLSSFRTTWQEFVEDLGPWRVLLLIFVI 
ALLSLGIAYYVSGVLPFVENQPBLVH 


6568 


3 


1183 


HASDRLLVLPDNYSHPSQASANLQGPSRTTELFHPTLASISSPM 
LEGAELYFNVDHGYI.EGLVRGCKASLLTQQDYINLVQCETLEDL 
KIHLQTTDYGNFLANHTNPLTVSKIDTEMRKRLCGEFEYFRNHS 
LE PLS TFLTYMTCS YMI DNVI LLMNGALQKKS VKEILGKCHPLG 
R FTEM EAVN I AETPSDL FNAI LI ET PLAP FFQDCMSEKAJbDELN 
IELLRNKLYKSYLEAFYKFCKNHGDVTAEVMCPILEFEADRRAF 
IITLNSFGTELSKEDRBTLYPlTGKLYPEGLRLIiAQAEDFDQMK 
w vj\un i«~v i ivf lit bAVvjObCaC»KTtiEDv FYERE VQMNVLAFNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKIN3YIPIL 


6569 


205 


1532 


RRRGPQRLGHGRPTPLLCRWRTAGPSHWEKQARAFQGLRPVDPR 
RMSWLFPLTKSASSSAAGSPGGLTSLQQQKQRLIESLRNSHSSI 
AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKQGVYVTSPLVNNFTMHSDLGKHQSLIiDEFWKNPPVtA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVSSSTTSHTTAKPAAPS FGVLSNLPLPI PTVDAS I PTSQNG 
c Lti twitru v rUAt i'c.ljbhXjb VSQuTDMNEQEEVLLEQFLTLPQLK 
QI ITDKDDLVKS IEELARKNLLLEPSIiEAKRQTVLDKYELLTQM 
KST FE KKMQRQHE LS ESCSAS ALQARLKVAAHB AEE ESDN IAE D 
FLEGKMEIDDFLSS FMEKRTI CHCRRAKEEKLQQAIAMHSQFHA 
PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVLLSHWVFVGAPRPPASDSWKKGLVPSAP 
PASRKMGSKALPAPIPLHPSLQLTNYSFLQAVNTFPATVDHLQG 
LYGLS AVQTMHMNHWT LG Y PNVHE I TRST I TEMAAAQGI..VDAR ? 
PFPALPFTTHLFHPKQGAIAHVLPALHKDRPRFDFANLAVAATQ 
EDPPKMGDLSKLS PGLGSPISGLSKLTPDRFCPSRGRLPSKTKKE 
FICKFCGRHFTKSYNLLIHERTHTDERPYTCDICHKAFRRQDHL 
RDHRYIHSKEKPFKCQECGKGFCQSRTLAVHKTLHMQTSSPTAA 
SSAAKCSGETVICGGT 


6571 


169 


656 


APDMKRKKLQKLTDTLTKNCKHLFRGFDKDNDGCVNVLEWI HGL 
S LFLRGSLE EKM KYC FE VFDLNGDGF I S KE EMFHML KNSLL KQ P 
S EEDPDEG I KDLVE I TLKKMDHDHDGKLS FADYELAVREETLLL 
EAFGPCLPDPKSQMEFEAQVFKDPNE FNDM 


6572 




1646 


T P ERAOPG ALLGAAG CCVCGGR W W PRB HF Rf: V PQ Q A if Mfi q v 15 d m 
LS CS ERHQ K LVDBNYCKKLHVQAL KNVNS Q I RNQM VQN3NDNRV 
QRKQFLRLI^NEQFELDMEEAIQKAEENKRLKELQLKQEEKLAM 
ELAKLKHESLKDEKMRQQVRENSIELRELEKKLKAAYMNKERAA
QIAEKDAIKYEQMKRDAEIAKTMMEEHKRIIKEENAAEDKRNKA
KAQYYLDLEKQLEEQEKJCKQEAYEQLLKEKLMIDEIVRKIYEED 
QLEKQQKLEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKII 
EFANMQQQREEDRMAKVQENEEKRLQLQNALTQKLEEMLRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEQMA 
LKELVLQAAKEEEENFRKTMLAKFAEDDRI ELMNAQKQRMKOLE 
HRRAVEKL I EERRQQFLADKQRELEEWQLQQRROGF INAI IEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI
CEEK 


6573 


767 


275 


gggggesqsfraqdgtrtpatdclmylqgprklmi*Oggydmvqk 
l fld ffrrrlsqr ptaee leqrni lkprneqe eqee kre i krrl 
trklsqrptveelrerkilirfsdyvevadaqdydrradkpwtr 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)


6574 


204 


1159 


LTAADKVSRGECWRVGGRTVCWVSLGSPLGSV " 

LESSVPVSv^VFWACGVSWTGAAGbQDGALSDTMARNAEKAMTA 
LARFRQAQLEEGKVKERRPFLASECTELPKABKWRRQIIGEISK 
KVAQIQNAGU3EFRIRDLNDEINKLLREKGHWEVRIKELGGPDY 
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 
PRKTRAELMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 
KWKAEREARLARGEFCEEEEEEEEEINIYAVTEEESDEEGSQEKG 
GDDSQQKFIAHVPVPSQQElEEAIiVRRKKMELLQKYASETLQAQ 


6575 
6576 


117 


820 


spalasqsggiteekmlepqengvidLpdyehvedetfppfppp 

ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKRNIPKLDAQRLI 

serglpalrhvfbkakfkgkgheaedlkmlirhmehwahrlfpk 

LQ FEDFI DRVE YLGSKKE VQTCIjKR IRLDLPI LHEDF VSNMDPv 
AENNEHDVTSTELDPFLTNLSESEMFASELSISLTEEQQQRIER
NKQLALERRQAKLP 


6577 


1


1060 


PbPUALVUUKKGALRLLVARLVLTVSAPAEVRRKVLRPVtSWMD ' 

RETRALADSHFRGLGVDVPGVGOAPGRVAFVSEPGAFSYADFVR 

G FLLPNL PC VFS S AFTQG WGSRRRWVTPAGRPD FDH LLRT YGD V 

WPVANCGVQEYNSNPKEHMTLRDYITYWKEYIQAGYSSPRGCL 

YLKDWHLCRDFPVEDVFTLPVYFSSDWLNEFWDALDVDDYRFVY 

AGPAGSWS PFHADIFRS FS WSVNVCGRKKWLLFPPGOEEALRDR 

HGNLPYDVTSPALCDTHLHPRNQLAGPPLEITQEAGEMVFVPSG 

WHHQVHNLVMCCFSCPLSGAFIiQEDGSTTSPLSQPELGWNGVAH 
G 


" *578 T 


2271 


987 


SDRMASDDFiJiviKAMLEAPYKkEEDEQQRKEVKKDYPSNTTSS 
TSNSGNETSGSSTIGBTSNRSRDRDRYRRRNSRSRSPGRQCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
R E KS P VPJ2P VDNLS P EE RDART VFCMQLAARI RPRDLEDF FS A V 

GKVRDVRIISDRNSRRSKGIAYVEFCEIQSVPLAIGLTGQRLLG 
VPII VQASQAEKNRIiAAMANNLQXGNGGPMRLYVGSLHFNITED 
MLRGIFEPFGKIDNIVLMKDSDTGRSKGYGFITFSDSECARRAL 
EQLNG FELAGR PMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAALQLNGAVPLGA 
LNPAALTALSPALNLASQCLQLSSLFTPOTM 




377 


1489 


PSSSATMNRafIjKJ^TILHMALTGASDPSAEAEANGEKPFLLRA 

lq i al ws l yw vts i smvflnkyllds ps lrldt p i f vt f yq cl 
vttllckglsalaaccpgavdfpslrldlrvarsvlplswfig 
mitfnnlclkyvgvafynvgrslttvfnvllsylllkqttsfya 
lltcgi i iggfwlgvdqegaegtlswlgtvfgvlaslcvslnai 
yttkvlpavdgsiwrltfynnvnacilflplllllgelqalrdf 

AQLGSAHFWGMMTLGGLFGFAIGYVTGLQIKFTSPIjTHNVSGTA 

KACAQTVLAVLYYEETKSFLWWTSNMMVLGGSSAYrMVRGWEMK 
KTPEEPSPKDSEKSAMGV 


6579 
6580 


2 


711 


RPPRVWYPELkhXSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT" 
IYMGKDKYENEDLIKHGWPEDIWFHVDKLSSAHVYLRLHKGENI 
fiu± v UMULAHLVKANS IQGCKMNNVNWYTPWSNLKKTADM 
DVGQ IG FHRQ KD VK I VTVEKKVNE I LNR L EKTKVE RF PDZAAE K 
ECRDREERNEKKAQIQEMKKREKEEMKKKREMDELRSYSSLMKV
ENMSSNQDGNDSDEFM 




62 


1571 

< 

J 


LVALKNWKP KGTN1 PAPQSPVFGEAVSGVYMMTKVLGMAPVLGP 
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG 
PREALSQLRVLCCEWLRPEIHTKEQILELLVLEQFLTILPQELQ 
\WVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
iCISSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA 
3DPRKVRDCRLSTQHEES ADEQKGSEAEGLKGDI IS VI I ANKPE 
\SLERQCVNLENEKGTKPPLQEAGSKKGRESVPTKPTPGERRYI 
CAECGKAFSNSSNLTKHRRTHTGEKPYVCTKCGKAFSHSSNLTb 
HYRTHLVDRPYDCKCGKAFGQSSDLLKHQRMHTEBAPYQCKDCG 
KAFSGKGSLIRHYRIHTGEKPYQCNECX5KSFSQHAGLSSH0RLH 
TGEKPYKCKECX3KAFNHSSNFNKHHRIHTGEKPYWCHHCGKTFC 
S KSNLSKHQRVHTGEGEAP 


CCOI 
DDOl 


228


476 


RVFLKDLSSTPMASNNTAS IAQARKLVEQLKMEANIDR I KVSKA 
AADLMAYCEAHAKEDPIjLTPVPASENPFREKKFFCAIIj 


6582 


1428 


718


CFli'KTHCS PVSVPYIjS PLVLRKELESLLENEGDQVI HTSSF1N 

qhpu fwtlvwyfrrldlpsnlpgli lts ehcnegvql plssls 
qdsklvyiqllwdninlhqepreplyvswrnfnsekkssllsee 
qqbtstlvetirqsiqhnnvlkpinllsqqmkpgmkrqrslyre 
ilflslvslgrenidieafdneygiaynslsseilerlqkidap 
psasvewcrkcfgapli 


6583 


487 


41 


rifsmtsgrlrwrctwrpatalwsaslrlgtssmhpsprsislp 
lsmmlsplpsntrglsptalfrspdsehatscprlhlwrcrapl 
rspspusrlqvlprsplhvhthnsgkevlglqvqrsrsgtgpac 
sqagsgavqggnwcif 


6584 


189 


1750


plpmaalgpssqnvteywrvpknttkkynimafnaadkvnfat 
wnqarlerdlsnkkiyqeeempesgagsefnrklreearrkkyg 
ivlkefrpedqpwllrvngksgrkfkgikkggvtentsyyiftq 
cpdgafeafpvhnwynftplarhrtltaeeaeeewerrnkvlnh 
fsimqqrrlkdqdqdedeeekekrgrrkaselrihdleddlems 
sdasdasgeeggrvpkakxkaplakggr kkkkkkgs ddeafeds 
ddgdfegqevdymsdgssssqeepeskakapqqeegpkgvdeqs 
dsseeseeekppeedkeeeeekkaptpqekkrrkdsseesdsse 
esdidseassaffmakkktppkrerkpsggssrgnsrpgtpsae 
ggstsstlraaas kleqgkrvsempaakrlrldtgpqs lsgkst 

PQPPSGICTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTKK 

tglsseqtvnvlaqilkrlnperkmindkkhfslke 


6585 


3 


1678 


GP I RNS R IDD F VGGD PRAE AS CS VLHS K PHAMADS RD PAS DQMQ 1 

hwkeqraaqkadvr ittgagn p vgd klnvitvg prg pllvqdwf 
tdemahfdrbriperwhakgagafgyfevthditkyskakvfe 
higkxtpiavrfstvagesgsadtvrdprgfavkfytedgnwdl 
vgnntpiffirdpilfpsfihsqkrnpqthlkdpdmvwdfwslr 
peslhqvsflfsdrgipdghrhmngygshtfklvnangeavyck 
fhyktdqgiknlsvedaarlsqedpdygirdlfnaiatgkypsw 

TFYIQVMTFKQAETFPFNPFDIjTKVWPHKDYPLIPVGKLVLNRN 

pvnyfaeveqiafdpsnmppgieaspdkmjlqgrlfaypdthrhr 
lgpkylh ipvncpyrarvanyqrdgpmcmqdnqggapnyypns f 
gapeqqpsalehsiqysgevrrfntanddnvtqvrafyvnvlne 
eqrkrlceni aghlkdaqi f iqkkavknftevhpdygshiqall 
dkynaekpknaihtfvqsgshlaarekanl 


6586


32 


804 


PLPEQPAESTSTMPVSGTPAPNKKRKSSKIjIMELTGGGQESSGL • 

nlgkkisvprdvmleelslltnrgskmfklromrvekfiyenhp 

u v 1? bMUn FQ KFL PTVGGQliGTAGQGFS YSKSNGRGGSQAGG 

sgsagqygsdqqhhlgsgsgaggtggpagqagrggaagtagvge 
tgsgdqaggegkhitvfktyispweramgvdpqqkmelgidlla 
YGAKAELP kyks fnrtamp ygg yekas krmtfqmp kv 


6587


75 


1117 


rrvpslgkmpecwdgehdietpygllhwirgspkgnrpailty 
hdvglnhklcfntffnfedmqeitkhfwchvdapgqqvgasqf 
pqgyqfpsmeqiiaamlps wqh fg fkyvigi gvgagayvlakfa 
li fpdlveglvlvni dpngkgw i dwaatklsglts tlpdtvlsh 
lfsqeelvnntelvqsyrqqignwnqanlqlfwnmynsrrdld 
1nrpgtvpnaktlrcpvmlvvgdnapaedgvvecnskldptttt 
flkmadsgglpqwqpgklteafkyplqgmgympsasmtrlars 
rtas lts as 3 vdgs r ?qacths e ss eg lgqvnhtmevs c j 

6588




137 


501 


LGLQAQLLK^RTNNyQLSDBLRKNGVELTSLRQKVAYLDKEFSK 
AQKALSKSKKAQEVEVLLSENEMLQAKLHSQEEDFRLQNSTLMA
E FS KLCSQMEQLEQENC3QLKEGAAGAGVAQAGP 


6589 
6590 


2


1405 


RPWGSAMATFSRQEFFQQLlQGCLLPTACjQGLDQIWLLLAICIA 
CRLLWRLGL P S YLKHAS TVAGG F FS L YH FFQLHMVW WL LS LLC 
YLVLFIX:RHSSHRGVFLSVTILIYLLMGFJ4HMVDTVTOHKMRGA 
OM I VAMKAVSLGFDLDRGE VGTVPSPVEFMG YLYFVGTI VFGP W 
I S FH S YLQAVQGR PLS CRWLQ KVARS LALALLCL VLSTCVG P YL 
FPYF I PLNGDRLLRNKKRKARGTMVRWLRAYESAVSFHFSNYFV 
GFLSEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEW 
TSWWLPMS YWLNNYVFKNALRLGTFSAVLVTYAASALLHGFS FH 
LAAVLLS LAF I TYVEHVLRKR LAR ILSACVLSKRCPP DCSHQHR 
LGLGVRALNLLFGALAI FHLAYLGSLFDVDVDDTTEEQGYGMAY 
TVHKWSELS WASHWVTFGCWI FYRLIG 




2177


656 


VRAYEHVLSLLENVFTPMFCHRDEYFRQLLRGAESPTRNSKLKR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIWMED0SPVEAVSTPNTPRNLAAWKISIPY 
VDFFEDPSSERKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR
R YLE FYVLBSKLTEFHGAFPDAQLPSKRI IGPKNYEFLKS KREE 
FQEYLQKLLQHPELSNSQLLADPLSPNGGETQFLDKILPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSE1WKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRW FQVPD WLHHLLMGTR I LPKNTLEM YTD YYLOCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VIQELFPELNKVQKEVTSVTSWM


6591 


2177 


656


VRAYEHVLSJjIjENVFTPMFCHRDEYFRQLLRGAESPTRNSKLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 

gvaegeddfieegiwmeddspveavstpntprnlaawkisipy 

VDFFEDPSSBRKEKKERI P VFCI DVERNDRRAVGHE P EHWS VYR 
RYLEFYVLESKLTEFHGAFPDAQLPSiCRIIGPKNYEFLKSKREE 
FQEYLQKLLQHPBLSNSQLLADFLSPNGGETQFLDKILPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEORLKAIVAEKFAIATKEG 
DLPQVERFFKIFPLLGLHEEGLRKFSEYLCKQVASKAEENLLMV 
LGTDMSDRRAAVIFADTLTLLFEG1ARIVETHQPIVETYYGPGR 
LYTLIKYLQVECDRQVEKWDKFIKQRDYHQQFRHVQNNLMRNS 
TTEKIEPRBLDPILTEVTLMNARSELYLRFLKKRISSDFEVGDS 
MASEEVKQEHQKCLDKLLNNCLLSCTMQELIGLYVTMEEYFMRE 
i vwivrtv/ujux i cMjQJbTSSMVDDVFYIVKKCIGRALSSSSIDCL 
CAM INLATTEL ES DFRJDVLCNKLRMG F PATT FQD I QRGVTS AVN 
IMHS S LQQGKFDTKG I ES TDEAKMS FLVTLNNV E VCS EN I S T LK 
KTLES DCTKLFSQG I GGEQAQAKFDSCLSDLAAVSNKFRDLLQE 
GLTELNSTAIKPQVQPWI NS FFSVSHNI EEEE FNDYEANDPWVQ 
QFILNLEQQMAEFKASLSPVIYDSLTGLMTSLVAVELEKWLKS 
TFNRLGGLQFDKELRSLIAYLTTVTTl^IRDKFARLSQMATILN 

LERVTEILDYWGPNSGPLTWRLTPAEVRQVLALRIDFRSEDIKR 
LRL 


6593


3 


1837 


EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR ' 
H.GGEGGGGRGRGDKRRRRQARRQRRRPE PAEARGGKMADVLS VL 
RQYNIQfCKEIVVXGDEVIFGEFSHPKNVKTNYVVWGTGKEGQPR 
EYYTLDSILFLUmVHLSHPVYVRRAATENIPWRRPDRKDLLG 
YLNGEASTSAS I DRSAPLE IGLQRSTQVKRAADEVLAEAKKPR I 
EDEECVRLDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
I KAKIMAKKRSTI KTDLDDDITALKQRS FVDAEVDVTRDIVSRE 
RVWRTRTTI LQSTGKNFSKNI FAI LQS VKAREEGRAPEQRPAPN 
AAPVDPTLRTKQPIPAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GMTLKSVTEGASARKTXJTPAAQPVPRPVSQARPPPNQKKGSRTP 
III IPAATTSLlTMLNAKDI i LODL.fTPVP<;nPK'W , vnr , r , notrKiwrT 

I QRRKDQMQPGGTAI S VTVPYRWDQPLKLMPQDWDRWAVFVQ 
G PAWQFKGWPWLLPDGS PVDI FAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELSYHKRHLDRPVFLRVWETLDRYMVKHKSHLRF 


6594 


1 


1096 


EFPGRRFRGSQASPLCATCGPALLRAPTRAAMTRSLFKGNFWSA 
DILSTIGYDNIIQHLNNGRKNCKEFEDFLKBRAAIEERYGKDLL 
«J-toj\ivrnr^\jv*«» ±sy 4 ijJs_to\j_ifci v £ IvvwVUNVAQCHIQIiAQSLREE 
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFKKTMDAKKNYE 
QKCRDKDEAEQAVSRSAKLVNPKQQEKLFVKLATSKTAVEDSDK 
AYMLHIGTLDKVREEWQSEHIKACEAFEAQECERINFFRNALWL 
HVNQLSQQCVTSDEMYEQVRKSLEMCSIQRDIEYFVNQRKTGQI 
P PAP r MYENFYSSQKNAVPAGKATG PNLARRGPLPI PKS S PDDP 
NYSLVDDYSLLYQ


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWLYLH 

R.YNJ1VPCPOPVT CT.CfTVPMT C\TJ /Tr/"»KTT'JC»-rivf>iT»r»nT T nnur nt«*s. 

** * ***** raaya * vi_iy i (JNWr INARRRLLPDMLRKD 
G KD PNQFT 3 S RRGGKAS D VALPRG S S PS VLAVS VP APTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTLTLLTRAEA 
GS PTGGLFNTPP PTPPEQDKEDFS S FQLLVE VALQRAAEMELQK 
QQDPS LPLLHTP IPLVS ENPQ 


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKALD
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV
KMVLSKLYENKKIASATHNIYAYRIYCEDKQTFLQDCEDDGETA
AGGRLIjHLMBILWKNVMVVVSRWYGGILLGPDRFKHINNCARN 
I LVEKNYTNS PEESSKAIjG KNKKVRKDKKRNEH 


6597 


2 


1026 


PRLPVRRYBGRRRLC<3RSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VLlQKSQMTEPGPDVKKKTEEEDVECEDDIiIIiACQPESS VKALD 
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHNIYAYRIYCEDKQTFLGDCEDDGETA 
AGGRLLHLMEILNVKIWMVVVSRWYGGILLGPDRFKHINNCARN 
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6598


1099 


419 


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLSFCRLHKQSSMTVMEAQESPLFNNVKLQRKLPVESIQIVLEE
LRKKGNLEWLDKSKSSFLIMWRRPEEWGKLIYQWVSRSGQNNSV
FTLYELTNGEDTEDEEFHGLDEATLLRALQALQQEHKAEIITVS
DGPRRQVLLAGTCLPLLLTSHLSRAFKRRQTQCPPKTGSVTPPD
SKGLQS


6599 


164 


1593 


KMAALTTLFKYIDENQDRYIKKLAKWVAIQSVSAWPEKRGEIRR 
MMEVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPILLGRLGS 
DPQKKTVCIYGHLDVQPAALEDGWDSEPFTLVERDGKLHGRGST 
DDKGPVAGWINALEAYQKTGQEIPVNVRFCLEGMEESGSEGLDE 
LIFARKDTFFKDVDYVCISDNYWLGKKKPCITYGLRGICYFFIE 
VE CSNKDLHSG VYGGS VHEAMTDL I LLMG S L VDKRGN I LI PGIN 
EAVAAVTF.EEHKLYDDIDFDI EEFAKDVGAQ I LLHS HKKDILMH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 
VVGEQVTSYLTKKPABLRSPNEFKVYMGHGGKPWVSDFSHPHYL 
AGRRAMKT VTGVEPDLTREGGS I PVTLTFQEATGKNVNLLP VGS 
ADDGAHS ONE KLNRYNY I EGTKMLAA YL YBVS QLKD ' 


6600 




934 


PGRLFRVAAMESAGLEQLLRELLLPDTERIRRATEQLQIVLRAP 
AALSALCDLLASAADPQIRQFAAVLTRRRLNTRWRRLAAEQRES 
LKSLILTALQRETEHCVSLSLAQLSATIPRKEGLEAWPQLLQLL 
QHSTHSPHSPEREMGLLLLSVWTSRPEAFQPHHRELLRLLNET 
LGEVGSPGLLFYSLRTLTTMAPYIiSTEDVPLARMLVPKLIMAMQ 
TL I P I DEAKACEALE ALDE LLES E V? VITP YLS E VLTFCLE VAR 
NVALGNAI R I R I LCCLTF LVKVKS KALLKNR LLATLAAH P ^PHC 
GC 


6601 


529 


1420 


P RAAARA P P PAVLRR DRRAATAPGAGE MTLHG PLAQR Y FLNH I E 
KITTWQDPRKAMNQPLNHMNLHPAVSSTPVPQRSMAVSQPNLVM 
NHQHQQCMAPSTLSQQNHPI^NPPAGI^SMPNALTTQQQQQQKL 
RLQRIQMERERIRMROEELMRORAAIiC'RnT.PMF&PTT novnftivu 

NPPTMTPDMRSITNNSSDPFLNGGPYHSREQSTDSGLGLGCYSV 
PTTPEDFLSNVDEMDTGENAGarPMNINPQQTRFPDFLDCLPGT 
NVDLGTLESEDLIPLFNDVESALNKSEPFLTWL


6602 


127 


617 


lld fpalp kfvlaqs pkag kpstmts mtqswTevi KAKTKARNF 

BRVLGKITLVSAAPGKVICEMKVEEEHTNAIGTLHGGLTATLVD 
NI STt4ALLCTERGAPGVSVDMNITYMSPAKIjGFDTVTTRinrr vr» 
GKTLAFTSVDLTNKATGKLIAQGRHTKHLGN 


6603 


79 


660 


PVGPSSLAARTGIiGHLPFLHRLASSRGLDMDLLQFLAFI.FVLLL 
SGMGATGTLRTSLDPSLEIYKKMFEVKRREQLLALXNLAQLNDI 
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD
AFSHVVENTAFFGDWLRFPRIVHYYFDHNSNWNLLIRWGISFC 
NQTGVFNQGPHSPILSLM 


6604 


3 


668 


GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERWLLGEFL 
HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEIFGQLR 
DFYFSVKLSENMKASSFKKLQKFYIDPYKtiLPLQRFLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


348 


SGSRRGAMRAAGVGLVDCHCHLSAPDFDRDUDDVIjEKAKKANW 
ALVAVAEHSGE FEKIMQLSERYNGFVLPCLGVHPVQGLP PEDQR 
SVTLKDLDVALPIIENYKDRLLAIGBVGIjDFSPRFAGTGEQKEE 
QRQVLIRQIOLAKRLNLPVNVHSRSAGRPTINLLQEQGAEKVi,L 
HAFDGRPS VAMEGVRAGYFFS IPPSII RSGQOKLVKQLPLTS I C 
LETDS PALGPEKQVRNEPWNI S I SAB YI AQVKGIS VEE VI E VTT 
QNALKLFPKLRHLLQK 


6606 


2 


1682 


FVE IR PRAE VANLS AHS AS P I QDAVLKRLSLLEDI V YRQLNGLS 
KS LGL I EG YGGRG KGGLPATL S P AE EE KAKG PHEKYGYNS YLS E 
KISLDRS I PDYRPTKCKELKYSKDLPQIS 1 1 FIFVNEALSVI LR 
SVHSAVNHrPTHLLKEIILVDDNSDEBELKVPLBEYVHKRYPGli 
VKWRNQKREGL IRARI EG W KVATGQVTG FFDAHVE FTAG WAE P 
VLSR 1 QENR KRV I LPS I DNI KQDNFE VQRYENSAHGYS WELWCM 
YISPPKDWWDAGDPSLPIRTPAMIGCSFWNRKFFGEIGLLDPG 
MDVYGGENIELGIKVWLOGGSMEVLPCSRVAHIERKKKPYNSNI 
GFYTKRNALRVAE VWMDDYKS HVY I AWNLPL ENPG I D I GDVSE R 
RALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGELRNNKAKDVC 
LDQGPLE^mTAILYPCHGWGPQlARYTKEGFLHLGAI^TTrLLP 
DTRCLVDNSKSRLPQLLDCDKVKSSLYKRWNFIQNGAIMNKGTG 
RCLEVENRGLAG IDLI LRSCTGQRWTI KNS I K 


6607 


137 


986 


VPACAGLKKEARSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 
GISFCGRGGAGPGVPTRTQVFAAMGAVMGTFSSLQTKQRRPSKD 
KIEDELEMTMVCHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP 
SGVVNElHTKQIYAQFFPHGDASTYAHYLFNAFDTTQTGSVKPfi 
DPVTAbSILLRGTVHEFCLRWTFNLYDINKDGYINQEEMMDIVKA 
IYDKT4GKYTYPVLKEDTPRQHVDVFFQKM>KNKDGIVTLDEFLE 
SCQEDDN I MRSLQLFQNVM 


6608 


224 


1140 


RPCFSSPTGIiCPRIiSYPMILLQHAVLPPPKQPSPSPPMSVATRS 
TGTLQLPPQKPFGQEASLPLAGEEELSKGGEODCALEELCKPLY 
CKLCNVTLNSAOOAOAHYQGKNKGKKLRNYYAANSCPPPARMSN 
WEPAATPWPVPPQMGSFKPGGRVILATENDYCKLCDASFSSP
AVAQAHYQGKNHAKRLRLAEAQSNSFSESSELGQRRARKEGNEF
KMMPNRRNMYTVQNNSGPYFNPRSRQRIPRDLAMCVTFSGQFYC
SMCNVGAGEEMEFRQHLESKQHKSKVSEQRYRNEMENLGYV


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTLLSWSAVT 
PAAEPGNFQLSPAEPRGPLASPVRAAPRAPCPAAEMSELNTKTS
PATNOAAGOEEKGKAGNVICKAFFPPPTnTnT.TaDirTtrvaiiT n-m 
GKFRRFQKRKKDPSS


6610 


319 


881 


GRKSLCNLHIPIRFPLTYPDMYMGMMCTAKKCGIRFQPPAIILI 
YESEIKGKIRQRIMPVRNFSKFSDCTRAAEQLKNNPRHKSYLEQ
VSLRQLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDLNKL
DDKELAKRKSIMDELFEKNQKKKDDPNFVYDIEVEFPQDDQLQS
CGWDTESADBF 


6611 


978 


212 


PGCSGAGSRVW WIjPALRHLAMGSTESS EGRHVS FGVDEEE RVRV 
LQGVRLSENWNRMKEPSSPPPAPTSSTFGLQDGNLRAPHKEST 
LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHSKAS LPTGEGS I SHBEQKS VRLARELESREAELRRRDTFYK 
EQLERIERKNAEMYKLS < 3EOFHRAA<2lCMFQTTVDDm/FD\7r , cr , r 
QAQ I LHCYRDRPHEVLLCS DLVKAYQRCVSAAHKG 


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM " 
STEAQRVDDSPSTSGGSSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKLSTSAKR IQKEltAE I TLD P PPNCSAG PKGDNI YEWRS 
TILGPPGSVYEGGVFFLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGVI CLDI LKDNWSPALTI S KVLLS I CSLLTDCNPADPLVGS I 
ATQYMTNRAEHDRMARQWTKRYAT 


6613


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI 
PTVE DT YRO VI S CDKS I CTIjO I TDTTfi S Hn P D& mod t t c vn u a 

FI LVYS I TSRQS LEELKP I YEQ ICEI KGDV3S I P I MLVGNKCDE 
S P SR E VQS S EAEALARl'WKCAFMETS AKLNHNVKE L FQ ELLNLE 
KRRTVSLQIDGKKSXQQKRKEKLKGKCVIM 


6614 


3 


1191 


SSAAEAMRVLVRRCWGPPLAHGARRGRPSPQWRALARLGWEDCR 
DSRVREKPPWRVLFFGTDQFAREALRALHAARENKEEELIDKLE 
WTMPS PS P KG L PVKQYA VQS QL PVY EW PDVGSGE YDVGWAS F 
GRLLNEAL I LK FP YG ILNVHPS CLPR WRG PAP VI HTVLHGDTVT 
GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELEAVLSRLGAN 
MLISVLKNL PESLSNGRQQ PME GATYA P K I SAGTSC I KWEEQTS 
EQIFRLYRAIGNIIPLQTLWMANTIKLLDIiVEVNSSVLADPKLT 
GQALIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NGYLHPWYQKNSQAQPSQCRFQTLRLPTKKKQKKTVAMQQCIE 


"6615 


832


35 


GRVGAGASAMSELPGDVRAFLRBHPSLRLQTDARKVRCILTGHE 
LPCRLPELQVYTRGKKYQRLVRASPAFDYAEFEPHIVPSTKNPH 
QLFCKLTLR H I N KC PEHVLRHTQGRR YQRALCKYBECQKQG VE Y 
VPACLVHRRRRRBDQMDGDGPRPREAFKEPTSSDEGGAASDDSM 
TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD
EGRRETTVYRGLVQKRGKKQLGSLKKKPKSHHRKPKSFSSCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLV I TQI PA PR H LRNR P FS FS RGGLDS FS GSLST PS ICRS 
PAWVKMAPWPPKGLVPAVLWGLSLFIjNLPGPlWLQPSPPPQSSP 
PPQPHPCHTCRGLVDSFNKGLBRTIRDNPGGGNTAWEEENLSKY 
KDSETRLVEVLEGVCSKSDFECHRLLBLSEELVESWWFHKQQEA 
fuijrvwijCbUbijKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCX3LGYFBAERNASHLVCSACF 
GPCARC5G PEESNCLQC KKGWALHHLKCVDIDECGTEGANCGAD 
QFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 
uuvutut 1 o VLPGENKQC^NTEGGYRCICAEGYKQMEGICVKEQ 
I PES AGFFSEMTEDELWLQQMFFGI 1 1 CALATLAAKGDLVFTA 
I FIGAVAAMTGY WLS ERSDRVLEGFIKGR 


6617 


118 


673 


VWMAWQVSLLELEDRLQCPICLEVFKESLMLQCGHSYCKGCIiVS 

LSYHLDTKVRCPMCWQAVDGSSSLPNVSIiAWVIEALRLPGDPEP 

KVCVHHRNPLSLFCEKDQELICGLCGLLGSHQHHPVTPISTVCS 

RMKEELAALFSEIiKQEOKKVDELIAKLVKNRTRIDGSAPSLCPC 
LGPATFTFL 


6618 


578


136 


ixskvarrapnspafqndiyplvsaprAttaespwskvlqntqcr - 

NVPKMTSERS RI PCLS AAAAEGTGKKQQEGRAMATLDRKVPS PE 

aflgkpwsswidaaklhcsdnvdleeagkeggksrevmrlnkea 

WKYGT 


6619 


246 


842 


passevltaavmflllncivavsqnmgigkngdlprpplrnefr 
yfqrmtttssvegkqnlvimgrktwfsipkknrplkurinlvls 
rei.keppqgahflarslddal klter pelankvdmi wi vggss v 

YKEAMNHLGHLKLFVTRIMQDFESDTFFSEIDLEKYKLLPEYPG 
ILSDVQEGKHtKYKFEVCEKDD 


6620 


3 


1879


NSRVDDFVARARMAAENEASQESALGAYSPVDYMSITSFPRLPE 
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRL PS SS SB 
MGSQDGSPLRE TRKDP FSAAAAECSCRQDGLTVI VTACLTFATG 
VTVALVMQ I YFGDPQI FGX3GAWTDAARCTSLGIEVLS KQGSSV 
DAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHLIDFRES 
APGALREETLQRSWETKPGLLVGVPGMVKGLHEAHQLYGRLPWS 
QVLAFAAAVAQDGFNVTHDLARALAEQLPPNMSERFRETFIiPSG 
RPPLPGSLLHRPDLAEVLDVLGTSGPAAFYAGGNLTLEMVAEAQ 
HAGGVITEEDFSNYSAIiVEKPVCGVYRGHLVLSPPPPHTGPALI 
SALNILEGFNLTSLVSREQALHWVAETLKlAIiALASRLGDPVYD 
STITESMDDMLSKVEAAYLRGHINDSQAAPAPLLPVYELDGAPT 
twj v Li inu fUD F X VAM VSS LNQ P FG SG L I TP SG ILLNSQMLDFS 
WPNRTANHSAPSLENSVQPGKRPLSFLLPTWRPAEGLCGTYLA 
LGANGAARGLSGLTQVRFTPWLAFFSREPSCGLDCRCLSYLWLV 
SIPHAANMG 


6621 


1


662 


VQGITSYQQRLQALRKEKSRDAARSRRGKENFEFYELAKLLPLP
AAI TSQtiDKAS I IRLT I S YLKMRDFANQGDP P WNLRMEGP P PNT 
SVKVIGAQRRRSPSALAIEVFEAHLGSHlLQSIiDGYVFAIiNQEG 
KFLYISETVSIYLGLSQVELTGSSVFDYVHPGDHVEMAEQIjGMK 
L P PGRGLLSQGTAEDGASSASS SS QS ETPE P WCFP PASDQFLL 


6622 


2 


319 


CiKASGAQEETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN'"" 
AEVGKSCIIKRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN
IFDMAGHPFFYEVRKPF 


6623 


1886 


189 


KALFEKVKKFRLHVEEGDILYAMYVRQT VLKVIKFLI 1 1 AYNS A" 
LVSKVQFTVDCNVDIODMTGYKNFSCNHTMAHLFSKLSFCYLCF 
VS I YGLTCL YTLYWLF YRSLRB YS FE YVRQETGFDDI PDVKNDF 
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNLNNEWTPDKL 
RQ KLOTNAHNRLELP L I MLSGL PDTVFE I TELQS LKLE 1 1 KNVM 
IPATIAQLDNLQELSLHQCSVKIHSAALSFLKENLKVLSVKFDD 
MR E LP PWM YGLRNIiEE LYLVGSLSHDISRNVTL ES LRDL KS LKI 
LS I KSNVS K I PQAWD VS S HLQKMCIHNDGT KL VMLNNL KKMTN 
LTELELVHCDLERIPHAVFSLLSLQELDLKENNLKSIEEIVSFQ 
HLRKLTVLKLWHNS ityipehi kkltslerlsfshnkievlpsh 

LFLCNKIRYLDLSYNDIRFIPPBIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLSPKIGNLLFLSYLDGKC5NHFEI 
LPPELGDCRALKRAGLWEDALFETLPSDVREQMKTE 


6624 


218 


1 / DO 


GS RRGGGS R I PAVSTH VAPGRS VLRP FASGALRLRS L VKALGGC 
RGRPSGLAHLSQETSHWRAKRSGRACLGDFPGEILRSFIMKCTA 
REWLRVTTVLFMARAI PAMVVPNATLLEKLLEKYMDEDGEWWIA 
KQRGKRAITDNDMQSILDLHNKLRSQVYPTASNMEYMTWDVELE 
RSAESWAESCLWEHGPASLLPSIGQNLGAHWGRYRPPTFHVQSW
YDEVKDFSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
AINLCHNMNIWGQIWPKAVYLVCNYSPKGNWWGHAPYKHGRPCS 
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIERQQSQVHDT 
HVRTRSDDSSRNEVISAQQMSQIVSCEVRLRDQCKGTTCNRYEC
PAGCLDSKAKVIGSVHYEMQSSICRAAIHYG1IDNDGGWVDITR 
CX3RKHYFIKSNRNGIQTIGKYQSANSFTVSKVTVQAVTCETTVE 
OLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF 


6625 


1124 


543 


pgprggggsllstkalgrsrglgMhpgpssggteggvptalrpp 

GPLVPSTSDDNLLKNIELFDKLALRFHGRLLFLKDVLGDEICCW 
SFYGQGRKIAEVCCTS I VYATEKKQTKVBFPEARIFEETLNILI 
YETPRGPDPALLEATGGAAGAGGAGRGEDEENREHRVRRIHVRR 
H I THDERPHGQQI VFKD 


6626




1498 


SAVEFVYTDRFHLILGISVEFLCSLRSDATMESITACLHALUAL 
LDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPSIQLASLEW 
RQI ICAAQEHVKEKRRSAEVDDGAAEKETIiPEFGEGKDTGGLVP 
GKSLVFATLELCVCILVRQLPELNPKLTGSPGVKATKPQILLED 
GSRL VS AAL VI LS EL PAVCS PEG S I S IL PTIL YLTIG VLRETA V 
KLPGGQLSSTVAASLQALKGILSS PMARAEKSRTAWTDLLRSAL 
TTI LDCWDP VDETHQELDE VSLLTAITVFILS TS PEVTT I PCLQ 
KRC I DKFKATLEI KDP WQ I KTYQLLHS I FQY PNPAVS Y P Y I YS 
LASCIMEKLQEIDKRKPENTAELEIFQEGIKVLETLVTVAEEHH 
RAQLVACLLPILISFLLDENSLGSATSIKRNLHDFALQNLMQIG 
PQYSSVFKSLVASSPALKARLEAAIKGNQBSVKVKIPTSKYTKS 
PGKWSS IQLKTSFL 


6627 


1 


697 


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL " 
GDTGVGKTCFLIQFIO^AFLSGTFIATVGIDFRNKVVTVDGVRV 
KLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDNIRA
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV

PFLETSAKTGMNVELAFLAIAKELKYRAGHQADEPSFQIRDYVE
SQKKRSSCCSFM 


6628 


1 


1861 


gCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN " 

KEFGDSLSLEILQIIKESQQQHGLRHGDFQRYRGYCSRRQRRLR 

KTLNFKMGNRHKFTGKKVTEELLTDNRYLLLVLMDAERAWSYAM 

QLKQEANTEPRKRFHLLSRLRKAVKHAEELERLCESNRVDAKTK 

LEAQAYTAYLSGMLRFEHQEWKAAIEAFNKCKTIYEKLASAFTE 

EQAVLYNQRVEEISPNIRYCAYNIGDQSAINELMQMRLRSGGTE 

GLLAEKLEALI TQTRAKQAATMSEVEWRGRTVPVKIDKVR I FLL 

GLADNEAAIVQAESEETKERLFESMLSECRDAIQWREELKPDQ 

KQRDYI LEGEPGKVSNLQYLHS YLTYI KLSTAI KRNENMAKGLQ 

RALLQQQPEDDSKRSPRPQDLIRLYDIILQNLVELLQLPGLEED 

KAFQKEIGLKTLVFKAYRCFFIAQSYVLVKKWSEALVLYDRVLK 

YANEVNSDAGAFKNSLKDLPDVQELITQVRSEKCSLQAAAILDA

NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP

GFQPIPCKPLFFDLALNHVAFPPLEDKLEQKTKSGL1X3YIKGIF 
GFRS 


6629 


5653 


4549 


GAT PLG S VGGR TG KMDAATLT YDTLR FAE FED F P ETS E P VW I LG 
RKYSIFTEKDEILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 
MLRCGQMI FAQALVCRHLGRDWRWTQRKRQPDSYFSVLNAFIDR 
KDSYYSIHQIAQMGVGEGKSIGQWYGPNTVAQVLKKLAVFDTWS 
SLAVHI AMDNTWMER1 R RLCRTS VPCAGATA FPADSDRH CNG F 
PAGAEVTNRPSPWRPLVLLIPLRLGLTDINSAYVETLKHCFMMP 
QSI^IGGKPNSAHYFlGYVGEELIYTiDPHTTQPAVEPTDGCFI 
P DES FHCQH P P CRMS I AEIiDPS I AVVRGGHLS TQAFGAE CCLGM 
TRKTFGFLRFFFSMLG 


6630 


2 


423 


LVQCX3GIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQ YVKDE FR RH KTVR<5DF A.DR V TjTi v w vm v a "ra r »r .nn a m p m or\ 

NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFS1 
SBSMKPKF 


6631 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQIiHRVLPPDLKS"" 

TiflnnYVimRPPffmrnffiqT\PRr>PPT npt.TPtnmmi>T r Aw^ivxTtivmn 
*~*'V i rvrv.ni\. j. vuoudrtyKr Jjy dritUv XA1 ALiL(Q\JANENRQ 

N5TGKACFGTFL?EEKLNDFRDEQIGQLQEIiMQEATKPNRQFSI 
SESMKPKF 


" 6632 


1273 


588 


wnsrgrtqrgaaplapaaamkawqrVtrasVYVggeqisaigr 

GICVLLGI S LEDTQKELEHMVRKIIiNLRVFEDESG KHWS KS VMD 
KQYEILCVSQFTLQCVLKGNKPDFHLAMPTEQAEGFYNSFLEQL 
RKTYRPEL IKDGKFGAYMQVH IQNDGPVTIELESPAPGTATSDP 

KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
DVSSEREP 


6633 


1145 


617 


muiujctvjvr a ijovt x x\j(jd viml>±_l i JrATIFSl/jPWGVLHSNPMDY 
AWGANGLDAIITQLLNQFENTGPPPADKEKIQALPTVPVTEEHV 
GSGLE CP VCKDDYALGERVRQLPCNHL FHDG C I VPW LEQHDS CP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGG I PRKGSGPRRRLPMARLRDCLPRLMbTLRS LLFWSLVYCYC 
GLCASIHLLKLLWSLGKGPAQTFRRPAREHPPACLSDPSLGTHC 
YVRIKDSGLRFHYVAAGERGKPLMLLLHGFPEFWYSWRYQLREF 
KS EYR WALDLRG YGETDAP I HRQN Y KLDCL ITDI KDILDSLG Y 

TiRHPAQLLKSSYYYFFQTPWFPEFMFSINDFKVLKHLFTSHSTG 
IGRKGCQLTTEDLEAYIYVFSQPGALSGPINHYRNIFSCLPLKH 
* wnr tic v AnHo v 1 Kir I V ivTV x r KZja IJjSEASH 
WLQQDQPDI VNKL I WTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGIiGPHGPSFARVPVAPSSSSG 
GRGGAEPRPLPLSYRLLDGEAALPAWFLHGLFGSKTNFNSIAK 
1 LAQQTGRRVLTVDARNHGDS P HS PDMS YEIMS QDLQDLL PQLG 
LVP CVWGHSMGG KTAMLLALQRPEL VER LI AVD I S PVE S TG VS 
HFATYVAAAIRAINIADELPR^RARKIJIDEQLSSVIQDMAVRQHL 
LTNLVE VIXSRFVWRVNUtf^TQHLDKI LAFPQRQES YLGPTLFL 
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI 
AAIRGFLV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE 
QPIVRQCLQRPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHDGTCVLDKAGSYKCACIJU3YTGQRCEP^LEAGKSKI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRIIAKIGT 
WSFFCNNSYVLSGNEKRTCQQNGEWSGKQPICIKACREPKISD 
LVRRRVLPMQVQSRETPLHQLYSAAFSKQKbQSAPTKKPALPFG 
DLPMGYQHLHTQLQYECISPFYRRLGSSRRTCLRTGKWSGRAPS
CIPICGKIEMITAPKTQGLRWPWQAAIYRRTSGVHDGSLHKGAW
FLVCSGALVNERT\WAAHCVTDU5KVTMIKTADLKVVLGKFYR 
DDDRDEKTI QSLQIS AI ILHPNYDPI LLDADI AI LKLLDKAR I S 
TOVQPICLAASRDI^TSFQESHITVAGWNVLADVRSPGFKNDTL 
RSGWS WDSLLCEEQHEDHG I PVS VTDNMFCAS WEPTAPS DI C 
TAETGG I AAVS F PGRAS PE PRWHLMG LVS WS YDKTCSHRLS TAF 
TKVLPFKDWIERNMK ~~ ~ 


6638


1391 


224 


GG I PQAGGKMAAPWWRAALCECRRWRGFSTS AVLGRRTPPLGPM 
PNSD IDLSNLERLEKYRS FDR YRRRAEQEAQ APH WWRTYRE YFG 
EKTDPKEKIDIGIiPPPKVSRTQQliLERKQAIQELRANVEEERAA 
RLRTAS VPLDAVRAEWB RTCG PYHKQRLAE YYG LYRDLFHGAT P 
VP R VPLHVAYAVGE DDLM P VYCGNE VT PTE AAQAP BVTYEAEEG 
SLWTLLLTSLDGHLLEPDAEYLHWLLTNIPGNRVAEGQVTCPYL 
PPFPARGSGIHRLAFLLFKQDQPIDFSEDARPSPCYQLAQRTFR 
TFDFYKKHQETMTPAGLSFFQCRWDDSVTYIFHQLLDMREPVFE 
FVR P P P YH P KQ KRF PHRQ PLR YLDR Y RDS HE PTYG I Y 


6639 


2046 


1268 


IGC FI MDGGDDGNL I IKKRF VSEAE LDERRKRRQEEWEKVRKPE 
DPEECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFLDEVSRQQELIEKQRREBELKELKEYRNNLKKVGISQE 
NKKEVEKKLTVKPIETKNKFSQAKLIiAGAVKHKSSESGNSVKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVCIGIIiPGL 
GAYSGSSDSESSSDSEGTINATGKIVSSIFRTNTFLEAP 


6640 


117 


1043 


VLE P P DVSMAES E DRS LR I VI» VG KTG SGKS ATANT I LGE E I FDS 
RIAAQAVTKNCQKASREWOGRDLLWDTPGLFDTKESUDTTCKE 
ISRCIISSCPGPHAIVLVLLU3RYTEEEQKTVALIKAVFGKSAM 
KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NS KKTS KAEKESQVQELVELI EKMVQCNEGAYFSDDI YKDTEER 
LKQREEVLRKI YTDQLNE E I KLVEEDKHKSEEKKEKEI KLLKLK 
YDE K I KN I RE E AERN1 FKD VFNR I W KM LS E I WHR FLS KC KF YS S 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGLAWSS 
AS AP P PRGFS A I S CTVEGAPAS FGKS FAQKS GY FLCL SS LGSLE 
NPQENWAD I QI WDKS PLPLGFS P VCDPMDS KAS VS KKKRMCV 
KLLPLGATDTAVFDVRL5GKTKTVPGYLRIGDMGGFAIWCKKAK 
APRPVPKPRGLSRDMQGLSLDAASQPSKGGLLERTASRLGSRAS 
TLRRNDSIYEASSLYGISAMDGVPFTLHPRFEGKSCSPLAFSAF 
GDLT I KS LAD I EE E YN YG FWEKTAAAR LP P S VS 


6642


22 


1296 


PLEERMMTKMD PN DQAQRD 1 1 FELRR I AFDAES D PSNAPG S GTE 
KRKAMYTKDYKMLGFTNHINPAMDFTQTPPGMLALDNMLYLAKV 
HQDTYIRIVLENSSREDKHECPFGRSAIELTKMLCEILQVGELP 
NEGRNDYHPMFFTHDRAFEELFGICIQLLNKTWKEMRATAEDFN 
KVMQWREQ I TRALPS KPNS LDQ FKS KLRSL S YS E ILRLRQS ER 
MSQDDFQSPPIVELREKIQPEILELIKQQRLNRLCEGSSFRKIG 
NRRRQERFWYCRLALNHKVLHYGDLDDNPQGBVTFESLQEKIPV 
ADI KAI VTGKDCPHMKEKSALKQNKEVLELAFS ILYDPDETLNF 
I APNKYE YC I W 2 DGLS ALLG KDMS S ELT KSDLDTLLS M EMKLRL 
LDLENIQI PE AP PPI PKEPS S YDFVYHYG 


6643 


3049 


2265 


SLHAPAEGRTRGRLAEKPKMLTRKI KLWDINAH ITCRLCSGYLI 
DATTVTECLHTFCRSCLVKYLEENNTCPTCRIVIHQSHPLQYIG 
HDRTMQDI VYKLVPGLQEAEMRKQRE FYHKLGMEVPGDI KGETC 
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLE CNS S KLRG LKRKW I RCS AQ ATVLHL KKF I AKKLNLS S FNE L 
DI LCNEE I LGKDHTLKFVWTRWRFKKAPLLLH YRPKMDLL 


6644 


1489 


290 


FRPLATE PRGS S PVQLVSSTMSVRTLPLLFLNLGGEMLY ILDQR 
LRAQN I PGDKARKVLND I I STMFNRKFMEELFKPQELYSKKALR 
TVYERLAHAS I M KLNQ AS M D K L YDLMTMAFKYQ VLLCPR PKDVL 
LVTFNHLDT I KG F I RDS PT ILQQVDETLRQLTE I YGGLS AG EFQ 
LIRQTLLI FFQDLH I RVSM FLKDKVQNNNGR F VLP VSG PVPWGT 
EVPGLIRMFNNKGEEVKRIEFKHGGNYVPAPKEGSFEFYGDRVL 
KLGTNMYSVNQPVETHVSGSSKNLASWTQES IAPNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY 
EVINIQATQDQQRSEBLARIMGEFEITEQPRLSTSKGDDLLAMM 
DEL 


6645 


653() 


4646 


FVEGLAGYVYKAASEGiCVLTI»AALLLNRSBSDIRYJULGYVSQOG ■ 
GQRSTPLI IAARNGHAKWRLLLEHYRVQTQQTGTVRFDGYVID 
GATALWCAAGAGHFE VVKLL VSHGANVNHTT VTNS TP LRAACFD 
GRLDI VKYLVENNANI S I ANKYDNTCLMI AAYKGHTDWRYLLE 
QRADPNAKAHCGATALHFAAEAGHIDIVKELI KWRAAI WNGHG 
MTPLKVAAESCKADWELLLSHADCDRRSRIEALELLGASFAND 
REN YD 1 1 KT YH YL YLAML ER FQDGDN ILE KE VLPP IHAYGNRTE 
CRNPQELES I RQDRDALHMEGLI VRERI LGADN IDVSHP 1 1 YRG 
AVYADNME FEQCI KLWLHALHLRQKGNRNTHKDLLR FAQVFSQM 
I HLNETVKAP DI ECVLRCS VIiE I EQS MNRVKN I S DAD VHNAMDN 
YECNLYTFL YLVCI STKTQCS E EDQCKI N KQ I YNL I HLD PRTRE 
G FTLLHLAVNS NT P VD DFH TNDVCS FPNALVT KLLLDCGAEVNA 
VDNEGNS ALHI I VQ YNRP I SDFLTLHS I I I S L VE AG AHTDMTNK 
QNKTPLDKSTTGVSEILLKTQMKMSLKCLAARAVRANDINYQDQ 
IPRTLEEFVGFH 


6646 


176 


890


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL
TTAVTSAFLLAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTSAFLLAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSEKPLLEL


6648 


413 


897 


RNCWNCFTKY FNS PPED I DHKDS YLI TRS I MAEPDY I EDDNPElT"" 
IRPOKLINPVKTSRNHODLHRELLMNQKRGLAPQNKPBLQKVME 
KRKRDQVIKQKEEEAQKKKSDLEIELLKRQQKLEQLELEKQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


WIPRAAGIRHEVKWDVKEIMSQHNIYVDALLKEFEQFNRRLNEV
S KRVR I PLP VS N I LWEHC t R LANRT I VEG YANVKKCSNEGRALM 
QLDFQQFLMKLEKLTDIRPIPDKEFVETYIKAYYLTENDMERWI 
KEHR E YSTKQLTNL VNVCLGSHINKKARQKLLAAI DDI DR PKR 


6650 


32 


765 


LVPLVFS LL VQS C KQ VYR S I AMKF VP CLLL VTL S CLGTLGQA PR 
QKQG STG EEFH FQTGGRDS CTMR PS S LGQG AG E VWLR VDCRNTD 
OTYWCEYRGQPSMCQAFAADPKSYWNQALQELRRLHHACQGAPV 
LR PS VCR EAGPQAHMQQ VTSS LKGS P E PNQQPEAGTPSLRPKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKPFQALCAFLI S FFRG 


6651


3425 


1353 


AKELL KVGDFSLCAGP YQNTADTMENLS KEPLAS FVSES FDISA 
CGI ATEHVKIDNS GEGLTAEAGSETLS RDGEVG VNSDMH YELSG 
DS DLDLLGDCRNPRLDLEDS YTLRGS YTR KKDVPTDG Y ES S LNF 
HNNNQEDWGCSS WVPGMETSLPPGHWTAAVKKEE KCVPP YVQI R 
DLHG I LRTYAN FS ITKELKDTMRTSHGLRRHPS FSANCGLPS S W 
TSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSASSANV 
FPKESPTQISIGAFPSTKISEAPFLHPAPRSRSPLLVTWESDP 
R PQGQ PRRG YTAS S LDSS S S WRERCSHNRDLRNSQRNHT VS FHL 
NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRSAS YEDI I IDVCTNLHVKLRS WKEA 
CKSTFL FY LVETEDKS FFVRTKNLLRKGGHTE I E PQH FCQA FHR 
ENDTLIIIIRNEDISSHLHQIPSLLKLKHFPSVIFAGVDSPGDV 
LDHTYQELFRAGG FVISDDKI LEAVTLVQLKEI I K I LEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANIIELLH 
YHQCDSRSSTKAEILKCLLNLQIQHIDARFAVLLTDKPTIPREV 










6652 


2 


1343 


IPG5TISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PPPLFLPTRPAERAWIRSRRASEWVGKMEVPRLDHALNSPTSPC 
EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEBLPVPHNIKISNI 
TCDS F KI S WEMDS KS KDRITH Y FI DLNKKENKNSNKF KHKD VPT 
KLVAKAVPLPMTVRGHWFLSPRTEYTVAVQTASKQVDGDYVVSE 
WSEIIEFCTADYSKVHLTQLLEKAEVIAGRMLKFSVFYRNQHKE 
YFD YVR BHHGNAMQ PS VKDNSG SHG S P I S G KLEG I F FS CSTEFN 
TGKPPQDSPYGRYRFBIAAEKLFNPNTNLYFGDFYCMYTAYHYV 
ILVIAPVGS PGDEFCKQRLPQLNS KDNKFLTCTEEDGVLVYHHA 
QD V I LEVI YTD P VDLS LGTVAE I TGHQLMS LSTANAKKD PSCKT 
CNISVGR 


6653 


170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
R VAAAAS RGADDAMES S KPGPVQVVLVQKDQHS FELDEKALAS I 
LLODHIRDLDWWSVAGAFRKGKSFILDFMLRYLYSQKESGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSQSTVKDCATIFALSTMTSSVQIYNLSQNIQED 
DLQQLQLFTE YG RLAMDE I FQKP FQTLM PL VRDWS F P YE YS YG L 
QGGMAFLDKRLQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
IiQVATSPDFDGKLKDIAGEFKEQLQALIPYVLNPSKLMEKEING 
SKVTCRGLLEYFKAYIKIYQGEDLPHPKSMLQATAEAYNLAAAA 
SAKDIYYNNMEEVCGGEKPYLSPDILEEKHCEFKQLALDHFKKT 
KKMGGKDFSFRYQQELEEEIKELYENFCKHNGSKNVFSTFRTPA 
VL FTG I VAL YI ASGLTGF I GLE WAQLFNCM VGLLL I AL LTWG Y 
I R YS GQ YRELGGA I DFGAAYVLEQ AS SH IGNS TQATVRDA WGR 
PSMDKKAQ 


6654 


1 


705 


RTSLSPSQCSSFNLtAMASAGMQILGVVLTLLGWVNGLVSCAIiPM 
WKVTAFIGNSIWAQWWEGLWMSCWQSTGQMQCKVYDSLLAL 
PQDLQAARALCVIALLVALFGLLVYIiAGAKCTTCVEEKDSKAi^ 
VLTSG I VF V 2 SGVLTLI PVCWTAHAVI RDFYNPLVAEAQKRELG 
AS L Y LGW AASGLL LLGGGLLCCTC P S GGSQGPS HYMAR YS TS AP 
AISRGPSEYPTKNYV 


6655 


341 


16


KDAYMFKKGLLALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGA
I N I PLKE VKER I ATAVPDKNDTVKVYCNAGRQSGQAKE ILSEMG 
YTHVENAGGLKDIAMPKVKG 


6656 


2 


1212 


TELPPRPANLAIQPPLSPLRALAPLPEKPGAVP.PPQKRMAkVAk ' 
DLN PG VK FCMS LGQLQS ARG VACLGCKGTCSG FE PHS WRKI CKS C 
KCSQEDHCXTSDLEDDRKIGRLLMDSKYSTLTARVKGGDGIRIY 
KRNRM I MTN P I ATGKDPTFDTITYEWAP PGVTQ KLGLQ YME L I P 
KEKQPVTGTEGAFYRRRQLMHQLPIYDQDPSRCRGLLENEIiKLM 
BEFVKQYKSEALGVGEVALPGQGGIiPKEEGKQQEKPEGAETTAA 
TTNGSliSDPSKEVEYVCBLCKGAAPPDSP.WYSDRAGYNKQWHP 
TCF VCAKCS E PL VDL I YFWKDGAPWCGRHYCE S LRPRCSGC DE I 
ir+\r,u i urtv£jjJij/\wnKK-Hr VuhvaLEQL»LSGRAYIVTKGQLLCPr 
CSKSKRS 


6657 


83b 


2126 


LLTCQERAGDCLLSASTMKEWYWSPKKVADWLLENAMPEYCEP 

LEHFTGQDLINLTQEDFKKPPLCRVSSDNGQRLLDMIETLKMEH 

HLEAHKNGHANGHLNIGVDIPTPDGSFSIKIKPNGMPNGYRKEM 

IKIPMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMISVVHER 

VPPKEVOP"PLPDTFFDHFNRVQWAFSICEINGMILVGLWLIQWL 

LLKYKSIISPJ^FFCIVGTLYLYRCITMYVTTLPVPGMHFNCSPK 

h FGDWEAQLRR I M KLI AGGGLS I TGSHNMCGDYL YSGHTVMLTL - 

TYLFIKEYS PRRL WWYHWI CWLLS WG I FCI LLAHDHYTVD WV 

AYYIT^RLFWWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 

NVQGIVPRSYHWPFPWPWHLSRQVKYSRLVNDT 


6658 


35 


855 


HCCALGAPGS P YRG LY FS S AA PCTA PR KAKHQSTLEGLTKRM LM 








FDPVPVKQKAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 

QTPEGLSHGIOMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 

SPGLSMPSSSPPIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 

IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 

MQVPVIESYEKPISQKKIKIEPGIBPQRTDYYPEEMSPPLMNSV 
SPPQALLOE 


6659 


18 


523 


K PQRGDCETW FQN CS LPKF VC F FCWG F W LW RAHS MSNLHSLPG L 

RGLTSISRNQUJCTNAMRVINNYQRRWKNQNTFLIATFANVVNV 
CGNPT I TCPHNRTLNNCHHSGVOVPT.M vrMT.TT dq dtim t e mpd v 

AQTPANMFYI VACDNRDQRRDPPQYP WPVHLHTI I 


6660 


514 


1707 


" » *a *? ui t i ah v vt tr o/tiNjjjjy/VAAVjAo/UtACDS VTSNV 

LPLLLEQFHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPL 
NGFKDQLCSLVFMALTDPSTOI.OT»VRTDTT.T\7rnnnDT\T t oven. 

LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHL 
VPKLAEELRVGESNLTNGDEPTQCSRHLCCLQALSAVSTHPSIV 
KETLPLLLQHLWQVNRGNMVAQSSDVIAVCQSLRQMAEKCQQDP 
E S CW YFHQTA I PC LLALAVQASM PE KE PS VLRKVLLE DE VLAAM 
VSVIGTATTHLSPELAAQSVTHIVPLFLDGNVSFLPENSFPSRF 
Q P FQDGS SG QRRL I ALLMAFVCS LP RNVS EH I WE VLL FNLDK VT 
PG 


6661 


179 


430 


v * * w-w-»-j^» a uon x »»xj«c.miu y IC JJBlxftJUitiK Y I/iy AAKLM I GMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERODARYGGKGGARCC 


6662 


185 


423


rslpkpapaqpasihcarfsgvtpptaktamsdg^'i'Xfnalmyc 
gpkadlx5n i fs acapassavkas vs vaqpgqavip 


6663 


3 


1005 


rpvlssrvi^dfvpplpetsgrrkklermysvdrvsddipirtwf 

PKENLFSFQTASTTMQAISNFRKHLRMVGSRRVKAQTFAERRER 

SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVPPKVMT"iT <S Q WDwrvTDTYTDDnnMrT -r 

P ILYDHEHATFED ILE E I ERKLNVYHKGAKI WKMLI FCQGGPGH 
LYLLKNKVATFAKVEKEEDMIHFWKRLSRLMSKVNPEPNVIHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFESPLELSAQGKQMIET 
YFDFRLYRLW KSRQHS KLLDFDDVL 


6664 


58 


968 


PRLLRLPRSVVVMOSPWDELALAFSRTSMPPFFDIAHYLVSVMA 
VKRQ PGAAALAW KNP I S S WFTAMLHC FGGG I LSCLLLAE P PLKF 
LANHTN I L LAS S I WY I T FF CPHDL VSQG YS YL PVQLLAS GMKEV 

TRTWKIVGGVTHANSYYKNGWIVMIAIGWARGAGGTIITNFERL 
VKGD WKPEGDE WL KMS Y PAKVT LT.G S VT FTT?n wro wr .a t e ituxtt 

MPLYTIFIVATKITMMTTQTSTMTFAPFEDTLSWMLFGWQQPFS 
SCEKKSEAKS PSNGVGSLASKPVDVASDNVKKXHTKKNE 


6665 


171 


1278 


DERKLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLGPGCTQNPCSVHTATGPEPRKLPLLPPDSPNSGYPKEPA 
ALCPGI PSPCRMTHQDLS ITAKLINGGVAGLVGVTCVFP IDLAK 
TRLQNQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVNLTLVTP 
E KA I KLAAND FFRRLLME DGMQRNLKMEMLAG CG AGMCQ WVTC 
PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 
ATLI AWELLRTQGLAGLYRGLGATLLRDIPFS 1 1 YFPLFANLNN 
LGFNELAGKAS FAHS FVSGCVAGS I AAVAVTPLDVLKTRIQTLK 
KGLGEDMYSGITDCAR 


6666 


498 


2868 


MTTFLPVPQMMAGFSFGTFGNPPMESPSAWO/TIHQPFIVSCLTL 
WS PG CW PQPI Q KEG VGLWD I R KPQS S LLRYGG MLS LQS AMS VR F 

NSNGTQLLALRRRL PPVLYD IHSRLPVFQFDNQVYFNSCTM KSC 
CFAGDRDQYILSGSDDFNLYMWRIPADPEAGGIGRWNGAFMVL 
KGHRSIVNQVRFNPHTYMICSSGVEKIIKIWSPYKQPGCTGDLD 
GRIEDDSRCLYTHEEYISLVLNSGSGLSHDYANQSVQEDPRMMA 
PFDSLVRREIBGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
3ASL PRS PP PTVDESADNAFHLGPLRVTTTNTVASTPPTPTCED 
SEQ ID NO:

Predicted beginning nucleotide location
corresponding to first amino acid residue of
amino acid sequence

Predicted end nucleotide location
corresponding to first amino acid residue of
amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








AASRQQRLSAIiRRYQDKRLLALSNESDSEENVCEVEIiDTDLFPR 
PRSPSPEDESSSSSSSSSSEDBEEI^NERRASTWQRNAMRRRQKT 
TRED KPSAP I KPTNTY I GEDNYD YPQI KVDDLSSS PTSS PERS T 
STLEIQPSRASPTSDIESVBRKIYKAYKWLRYSYISYSNNKDGE 
TSLVTGEADEGRAGTSHKDNPAPSSSKEACLNIAMAQRNQDLPP 
EGCSKDTFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEHPFETKKIiNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEBRSLETICANHNNGRLHPRPPHPHNNGQNLGELEW 
AYSSPGHSDTDRDNSSLTGTHiHKDCCGSEMACETPNAGTREDP 
TDT PATDS S RAVHGH SG li KRQR I E LSDTDS ENSSSEKKLKT 


6667 


171 


1310


ABEVERIjy^lRSDSLVPGTHfPPIRRRSKFANLGRIPKPWKWRK 
KKSEKFKHTSAALERKISMRQSREELIKRGVLKEIYDKDGSLSI 
SNEEDSLENGQSLSSSQLSLPAI»SEMEPVPMPRDPCSYEVLQPS 
DIHDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSLV 
SYTAQKSGQQGVAQHHHTVLPSQIQHQLQYGSHGQHLPSTTGSL 
PMHPSGCRM I DELNKTLAMTMQRLES S EQR VP CS TS YH S SGLHS 
GDGVTKAGPMGLPEIRQVPTWIECDDNKENVPHESDYEDSSCL 
YTREEEEEEBDEDDDSSLYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELEEKNILPRQTDEERLELRQQIGTKL 


6668 


714 


358 


TLAVATGPALTLRCHVCTSSSNCKHSWCPASSRFCKTTNTVEP 
LRGNLVKKDCAESCTPSYTLQGQVSSGTSSTQCCQEDLCNEKLH 
NAAPTRTALAHS ALSLGLALS LLAVI LAPSL 


6669


459 


1207 


KDEETRKDYDYMLDHPEEYYSHYYHYYSRRLAPKVDVRWILVS

VCAISVFQFFSWWNSYNKAISYIATVPKYRIQATEIAKQQGLLK 

KAKEKGIOJKKSKEEIRDEEENIIKNIIKSKIDIKGGYQKPQICD 

liLLFQIILAPFHLCSYIVWYCRWIYNFNIKGKEYGEEERLYIIR 

KSMKMSKSQFDSLEDHQKETFLKRELWIKENYEVYKQEQEEELK 

KK1ANDPRWKRYRRWMKNEGPGRLTFVDD 


6670 


184 


594 


VARI*GEAAKMSSEPPPPYPGGPTAPLLEEKSGAPPTPGRSSPA 

VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM

PPGFYPPPGPHPPMGYYPPGPYTPGPYPGPGGHTATVIiVPSGAA 
TTVTV 


6671 


1 


763 


LPAEKPRSAPNMAGGRCGPQLTALLAAWIAAVAATAGPEEAALP 
PEQSRVQPMTASNWTLVMEGEWMLKFYAPWCPSCQQTDSEWEAF 
AKNGEILQISVGKVDVIQEPGLSGRFFVTTIiPAFFHAKDGIFRR 
YRGPGIFEDLQNYILEKKWQSVEPLTGWKSPASLTMSGMAGLFS 
ISGKI WHLHNYFTVTLGI PAW CS YVFFVI ATLVPGLSMDLVL* V 
ISQCNWDPPYRHVS + /RPSTNLGVHTAHTSEHLRL 


6672 


304 


1089 


APGSKPVQFMDFEGKTS FGMSVFNLSNAIt*3SGILGLAYAKAHT 
GVIFFLALLLCIALLSSYSIHLLLTCAGIAGIRAYEQLGQRAFG 
PAGKVWATVICLHNVGAMSSYLFIIKSELPLVIGTFLYMDPEG 
DWFLKGNLLI IIVSVLI ILPLALMKHLGYLGYTSGLSLTCMLFF 
LVSVIYKKFQLGLCYRATMKQQWESEALVGTPQPRDSTAAVKAQ 
MFHS * LTG VLTQW P I MAFAFVCH PGGAG PS I TELCRA FQAQD 


6673 


1116 


1963 


LQIQTHHTHHGARVTHLGSHQLLANAGTMLCRQQSSSMAPAFSQ 
S VTCGP S PC VR KQES ATKCLH I G ACGS DLWARG WE QG *G * GLNV 
WI,CPCVAFHRGARPQAEEGGARWWSLVSSPWIPPNP*HSS1GAE 
NAVPRP*QG* KVNPSGQERQS \ WVLPLPVPGEPLKLPGLPG*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPLKACPAPRESDPCSSCLSCPCVTQKPRFSDTGM 
FGAGHCHSS CDFTRKGAAGGPG 


6674 


1 


440 


hE FDYMCQYDYVEVRDGDNRDGQI I KRVCGNERPAP IQS IGSSL 
HVLFHSDGSKNFDGFHAIYEEITACSSSPCFHDGTCVLDKAGSY 
KCACLAG YTGQRCENLLE ERNCS DPG/ W PS QW VP ENN RG PW A YQ 
PTPC* IGTRVAFFLT 


6675 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6676


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE


6677 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6678


221 


865 


gpsnqssgslslivtgcssyws*indt'ctiLrvlssnfgrq*lr 
pfpcsqlpmsqgclwhldcccpwvpyipgqqwrkg3qrmrn*qs 
llgsdqesvgledlcvfvnfllhvllglfp*phelfllpvvdlg 
flfplllqggchclvlpanlvsqapqigklscrlqthdlegsrn 
hhplflwgrwdavkhletvqsglaslgfvgqhtshgpp 


6679 


2 


786 


lefargampflgqdwrs pgqnwvktvdgwkrfldeksgs fvsdl 
ssycnkevynkenlfnslnyd/scsqeekeghae *qnqns \dfh 
qekwiyvhkgstkerhgyctlgeafnrldfstaildsrrfnyvv 
rlleliaksqltslsgiaqknfmnilexwlkvledqqnitlir 
ellqtlytslctlvkrvgksvlvgninmwvyrmetilhwqqqln 
niqitrvsgqaqpppgsgslhrdtgqtrqdfeftpvteesglf 


6680


1498 


2951 


plctlplmpsalpgwagerwekqwpla/pgpgtwqtpvgsisee 
p\rknepdthcprgearpev*hlpkphspgsegaeiqtsa*alp 
/nqvsppqpm*gaeengdqrggkeeageelhrsssgltaapgf? 
evhrnlqtfpglpsrgggp/ggagtqgswapgeqpp/spllpas 
mqrsqaglpgweaglvespthhipalrpsgtnatgeafpsttcs 
sgp\ pappgptglrpgggsssgghg* * pglpvgkv\galgaaqd 
pqsqgrgptqgtvgtemllsglgsakacpaarpavp*lfsdpas 
tipkkgtrgfgegpgvlqernrwwgraqgftsadaagtappgv 
♦lpaplsqppgatepqvracgmappspgtsgrlvangrhpgpqv 
aqgcppgagcwgsqprgsqrcprtythsplghgrapcprrcwh* 
wqdppssprtgclpgiparqaysaprtrsrpgirtgraaygfir 
fqggggg 


6681 


1163 


511 


1NYIYYNQQQRAFHELK\EKLMSAPALGLPDLTKLFTLHVSER^" 
KMTVGVLTQTVG P WS RPGA YLS KQLDG VS KG W P PCPRALAATAL 

LAQEADELTLRQNLNRKSPHA\WTLINTKGHH*LINARLTRYQ 

TLLCENPHKTIEV^NT/LNPaTT.T t \rrwctJirvTJXTr'T mn nnimn 
" ** 4rt * j-cvon if lav f m i v 1 isbP V KHNCLEVLDSVYS 

S R PNLR DH P * TS VDWEL YVDG SGFANPCKVTLKKETS P APVTPR 
S 


6682 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK

NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6683


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6684


111 


527 


GLRGGTSRGRAGREPEFAAGVLCVVAGFCQSPCPPGGRGREAPA 
PP \ SGRRHA+ R PA* WLGG PGGDSGGREEGGS / GELQRAMES KMG 

&u±rijuj.wxyiit'KWDQSTFLGRARHFFTVTDPRNLLLSGAOLEAS 
RNIVQNYR 


6685 


258 


1473 


KLLGDNFEGFCNKFELSDSENGSNS*QSPL\FDRLFDPDPQKVL 
QGVIDMKNAVIGNNKQKANLIVLGAVPRLLYLLQQETSSTELKT 
ECAWLGSIAMGTENNVKSLLDCHIIPALLQGLLSPDLKFIEAC 

LRCLRTIFT^PUTDTTITT T VTnj\TtrTI>UT XMur t nnnr.i«,«-..» 

"■ A - tx? iafV i**iiiiijijx 1UA1 VIPHLMALLSRSRYTQEYICQ 
IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSYKVRMQALKCFS 
VTJ\FENPQVSMTLVNVLVDGELLPQIFVKMLQRDKPIEMQLTSA 
KCLTYMCRAGAI RTDDNC I VLKTLPCLVRMCSKERLLEERVEGA 
ETLAYLIEPDVELQRIASITDHLIAMLADYFKYPSSVSAITDIK 
RLDH DLKHAHELRQAAFKL YAS LG AND ED I R KKVSLGEG R P P VL 
TASRQGVTST 


6686 


310


927 


DSVTFDDLAVUFTPKEWTLLDPTQRNLYRDVMLENYKNLATVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTSSGIQMIGSHNGGEVSDVKQCGDVSSEHSCLKTHVRTQN 
SENTFECYLYGVDFLTLHKKTSTGEQRSVFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF* LQLTLGKSFH* SIHT 


6687 


181 


915 


EAMLEAPYKKEEDEQQRKEVKKDYPSNTtsSTSNSGNETSGSST 
IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
SRDHRREDRVHYRSPPLATGEPVDNLSPEERDARTVFCMQLAAR 
IRPRDLEDFFSAVGKVRDVRIISDRNSRRSKGIAYVEFCEIQSV 
PLAIG LTGQRLLG VP 1 I VQASQAEKNRLAAMANNLQKGNGGPMR 
L YVGS LH FNI TE DMLRG I FE PFGKV 


6688 


1025 


1 


AEVPNYPRVFHKCPDSCWRFKFQPIQLC?PYILLSFSSEKPP"ISF~ 

SEPGLPR /S ATARMATAAAP PNSS I DLPSDSGMGFI S PAGDSLD 

LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 

STSVGSWAAFTSSSSSSTNRDVAGLDFSTVTTSVSGSLVPSRE 

VAVICGSKGAGASGSASCSSRAGKTTEATAASSMPSGTSSFSTC 

TMSELEELFSLPSPAPLLSKLFTSSGSIAICCQDSGPSDTGRLS 

VCQLWLADSDTGKLSDCQEWTVGDSGGLTCPELSLGRM*MSLL








SSAVI FGYSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSAU3SSSRTS*STSS 
ABDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
QRNSLTARQLAMSL*ATKF*RNACNPNCLSSKKSAL*LSLNQRF 
GGSASRKPGNISFNSQKCSALSYCCNFVIKPREVSVSSENYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLLLLLGSGQGP 
QQVGAGOTFEYLKREHSLSKPYQGVGTGSSSLWNLMGNAMVMTQ 
YI R LT P DMQS KQGALWNR VPCFLRDWELQVHFK I HGQG KKNL\H 
GDGLAIWYTKDRMQP 


6691 


287 


1401 


LKTETSEEKARRYKDRPSQLNAVFQSQKKMIQAQESITLEDVAV 
DFTWEEWQLLGAAQKDLYRDVMLENYSNLVAVGYQASKPDALFK 
LEQG EQLWT I EDG IHS GACSDI W KVDHVLERLQS ES L VNRRKPC 
HEHDAFEN1VHCSKSQFLLGQNHDIFDLRGKSLKSNLTLVNQSK 
GYEI KNSVEFTGNGDS FLHANHERLHTAI KFPASQKLISTKSQF 
ISP KHQKTRKLEKHHVCSECGKAF I KKSWLTDHQVMHTGEKPHR 
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 
/GKGFIQKTCIiIAHQRFlITER 


6692 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL
IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL
IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW
NAAAGKFQVDTGAKFYRGMSLEYYGIEADDNPFFDLSVYFLP 


6695 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA
LAVGPSGCHTEP\FDEVWPSLFLGDAYAARDKSKLIQLGITHW
NAAAGKFQVDTGAKFYRGMSLEYYGIEADDNPFFDLSVYFLP 


6696 




782 


PRVRGRVGERWAFLSVPAAMSSEMEPLLLAWSYFRRRKFQLCAD 
LCTQM LE KS P YDQ AAW I LKARA LTEM VYI DE I D VDQEG IAEMML 
DENA I AQ VPRPGTSLKLPGTNQTGG PSQAVR PITQAGRP I TG FL 
RPSTQSGRPGTMEQAIRTPRTAYTARPITSSSGRFVRLGTASML 
TSPIXSPFINLSRLNLTKYSOKPKIJXKAL.TFV'TPHHPMnwv'ra.T n 

LAALSTEHSQYKDWW WK/DQ IEKCYYRVGM YREAE KQI KSS 




6697


3


782 


PPLFLRRLNSRALRPGSRKVMAVVPASLSGQDVGS FAYLTI KDR 
I PQILTKVI DTLHRHKSEFFEKHGEEG VEAEKKAI S LLSKLRNE 
LQTDKPFIPLVEKFVDTDIWNQYLEYQQSLLNESDGKSRWFYSP 
WLL V\ E C YM YRR I HEA I\IQSPPI D YFDVFKES KEQN F YG S Q ES 
IIALCTHLQQLIRTIEDLD\ENQLKDEFFKLLQISLWGEISVDL 
SL\SGGESSSQNTNVLNSLEDLKPFILLNDMEHLWSLLSNCK 


6698 


668


754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492 


EGELP/PARRVLPRAMTASAQPRGRRPGVGVGWVTSCKHPRCV 
LLGKR KGS VGAGS FQLPGGHLE FG ETWE E CAQR ETW EEAALH LK 
NVHFASVVNSFIEKENYHYVTILMKGEVDVTHDSEPKjnrEPEKN 








ESKR I I YNHAF F FQBS KWSGGI LQ 


6700 


1098 


1392 


TQCWRSSTPGMRTHFRTQP/RLECGQGFSQQBNGHCMOTNECIQ 

FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYBPNEDGTACVERT 
LLLGLCNLLGK 


6701 


2 


1485 


AAAGPRTRVRRAAAFEGQPSPSPGLGPTSDKAAAPRTPKRRRLW 
RQRQ/HPAMLCYVTRPDAVLMEVBVEAKANGEDCLNQVCRRLGI 
IEVDYFGLQFTGSKGESLWLNLRNRISQQMDGIjAPYRLKIjRVKF 
FVEPHLILQEQTRHIFFLHIKEALLAGHLLCSPEQAVELSALLA 
QTKFGDYNQNTAKYNYEELCAKELSSATLNSIVAKHKELEGTSQ 
AS AE YQ VLQ I VS AMEN YG I EWHS VRDS EGQKLI* IG VGP EG I S I C 
KDDFSPINRIAYPVVQMATQSGKNVYLTVTKESGNSIVLLFKMI 
STRAASGLYRAITETHAFYRCDTVTSAVMMQYSRDbKGHLASLF 
LNEN I NLG KKYVFD I KRTS KE V YDKARRALYNAG WDLVS RNNQ 
SPSHSPLKSSESSMNCSSCEGLSCQQTRVbQEKLRKLKEAMLCM 
VCCEEEINSTFCPCGHTVCCESCAAOLQVGESAAHFCLQPHLSL 
LLTGSRSQVLAR 


6702


397 


1971 


PLAKFLKLDbVNVLCLPKEDVFLFYRTdFCSMGLGSSCHLSLPK 
RAEALLCSRKATWRDLVAVRMAEEQEFTQLCKLPAQPSHPHCV 
NNTYRSAQHSQALLRGLLALRDSGILFDWLWEGRHIEAHRIL 
LAASCD YFKGM FAGGLKEMEQEE VLIHGVS YNAMCQI LHFI YTS 
ELELSLSNVQETLVAACQLQIPEIIHFCCDFLMSWVDEEN1LDV 
YRLAELFDLSRLTEQLDTYI LKNFVAFSRTDKYRQLPLE KVYSL 
LS SNRLEVS CETE VYEG ALL YHYSL EQ VQADQ I S LHE P P KLLET 
VRFPLMEAEVLQRLHDKLDPSPLRDTVASALMYHRNESLQPSLQ 
SPQTELRSDFQCWGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTASLAPRMSNQGIAVLNNFVYLIGGDNNVQGFRAESRCWRYDP 
RHNRWFQ IQSLQQEHADLS VC WGRYI YAVAGRDYHNDLNAVER 
YDPATNSWAYVAPbKREVYAHAGATLEGKMYITCGRKGRIT 


6703


45 


1244 


G VGPRAAAM PLELELCPGRWVGGQHPCF I IAE IGQNHQGDLDVA 
KRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT 
YGEHKRHLEFSHDQYRELQRYAEEVG I FFTASGMDEMAVEFLHE 
LNVPFFKVGSGDTNNFPYLEKTAK/TRGWHSVLRDVCGVQLNDE 
TS S WDVLGR VRTS KE KVUVIVLVLDYSGRPMVISSGMQSMDTMKQ 
VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP 
IGYSGHETGIAISVAAVALGAKVLERHITLDKTWKGSDHSAStjE 
PGELAELVRg VRLVERALGS PTKQLLPCEMACNEKLGKS WAKV 
KIPEGTILTMDMLTVKVGEPKGYPPEDIFNLVGKKVLVTVEEDD 
TIMEE 


6704 


82 


1007 


TMNTRNRWNSGLGASPA^RptRDPQDPSGRQGELSPVEDQREG" 
LEAAPKG PS RE S WHAGQRRTSAYTLIAPN I NRRNB IQR I AEQE 
IiANLEKWKEQNRAKPVHLVPRRLGGSQSETEVRQKQQLQLMQSK 
YKQKLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQY KTAE FL/ RQTEHR IARQ KCLS KCCLW PT I IiN 
MGQKLGLO\DSLKAEENRKLQKMKDEQHQKSEIjLELKRQQQEQE 
RAKIHQTEHRRVNNAFLDRLQGKSOPGGLEQSGGCWNMNSGNSW 
GI 


6705 


2 


186 


RLCRNSARVPCGWSASRSLGEGAGFIGPLRGPHPRAGGTGTSFT 
S Y KRKGG I MS T I AAF YGGKS IL I TVATGFLGKE LME KL FRTS PD 
LKVIYILVRPKAGQTLQHRVFQILDSKLFEKVIEVRPNVHEKIR 
AI YADLNQNDFAIS KEOMQELLS CTNI IFHCAATVRFDDTLRHA 
VQLNVTATRQLLLMASQMPKLEAFIHISTAYSNCNLKH I DEVI Y 
PCPVEPKKIIDSLEW\LDDAIIDEITPKLIRDWPNIYTYTK 


6706 


130 


531 


PTHSSSSHSQEMLGKLNMLRNDGHFCDITIRVQDKIFRAHKVVL 
AACS DF FRTKLVGQAE DENKNVLD LHHVTVTG F I PLLE YAYTAT 
LSINTENIIDVLAAASYMQMFSVASTCSEFMKSSILWNTPNSQP 
BK 


6707 


2233 


1343 


yWSGIGYEbQHFHWRKFHFEKKGPPSTCQBRLYESRSRWPCIS* 
GMVWGWTAVNGSW* GGQLRCVCVCTSHSSDSTRSSQRAS KCHS 
FFILSQ*KT*SSWENVn/FAKYSRIYSYGHSCSKGRGD*DFK*NV 
SQAR * SR FCGLCNPCGHCGLDINLRGGSS PWTDKHSCVHNNLLC 
NRRVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYA 
C * R CHWY FE WLL YNHCGD I LVACL + RRQL* S SQ 


6708 
6709


115


1729 


TVGSWSRSGRSPPVGRQLLLTGRGAQAAGSPQGGMALQVELVPT 
GEI I RVVHPHR PCXLALGSDG VR VTMESALTARDR VG VQD FVLL 
ENFTSEAAFIENLRRRFRENIjIYTYIGPVLVSVNPYRDIiQIYSR 
QHMERYRGVSFYEEPPHLLAVADTVYRALRTERRDQAVMISVES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNSSRFGKYMDVQFDFKGAPVGGKILSYLLEKSRWHQ 
NHGERNFHIFYQLLEGCEEETLRRLGLERNPQSYLYLVKGQCAK 
VSS INDKSDWKWRKALTVI DFTEDE VEDLLS IAASVLHLGNIH 
FAANEESNAQVTTEKQLKYLTRIiLSVEGSTLREALTHRKIlAKG 
BELLS PLNLEQAAYARDALAKAV Y S RT FTKL VG K I NRS LAS KD V 
ES PS WRSTTVLG L LD I YG FE VFQHNS FEQFCIWYCNE KJLQQLF I 

ELTLKSEQEEYEAEGIAWEPVQYFWNKI ICDLVEEKFKGI I \SI 
LDE\ECLRPGE 




3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAA KM EKKVS KRSRKEE EDLEALI AH FQTLDAKRTQT VE L P CP P 
PSPRLNASLSVHPEKOELILFGGEYFNGQKTFLYNELYVYNIRK 
DTWTKVDI PSPP PRRCAHQAWVPQGGGQLWVFGGEFAS PNGEQ 
F YHYKDLWVLHLATKTWEQVKSTGG PS GRSGHRMVAW KRQL I LF 
GGraESTRDYIYYNDVYAFmiDTFTWSKLSPSGTX3PTPRSGCQ\ 
I PS LPRAAS S VYGGYS KQRVKKDVDKGTRHS DMF 


6710 
6711 


158 


980 


RHKMTN YR VES S SGRAAR KMRLALMG PAF IAA I G Y I D PGN FATN 
IQAGAS FG YQLLWVWWANLMAMLI QI LSAKLG I ATGKNLAEQI 
RDHYPRPWWFYWVQAEIIAMATDLAEFIGAAIGFKLILGVSLL 
QGAVLTGIATFLILMLQRRGQKPLEKVIGGLLLFVAAAYIVELI 
FSQPNLAQLG KGM VI PSL PTSEA VFLAAGVL \ GAT IMPHVI / YT 
WH S SLTQHLHGGSRQQRYSATKWD VA I AMT I AG FVNLA I MATAA 
SELNFYGHTGVA 




3 


347 


VTE CKTMTC KMS QLERN I * TM I NTLHH YS VKLGHPDTL I HG EFK 

elvrtdlhnilm:<enkndqai*hihedix>tnahmqiifkeliml 
mamltwsyhdkmhdadygpgqqhrpg 


6712 
6713 


116 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDLRSWRLPPGENIDDWIAVHV 
VDFFNRINLIYGTMAERCS*TSCPVMAGGPRYEYRWQDERQYRR 
PAKLSAPRYMALLMDWIESLI 




2485 


3 


yARGSDSEDGEFEIQAEDDARARKLGPGRPLPTFPTSECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI 
QR KT I P VI LDGKD WAMARTGS GKTAC FL LPMFE RLKTH S AQTG 
ARAlilljSPTRELALg/rLKFTKELGKFTGLKTALlLGGDRMEDQF 
AALHENPDI I IATPGRLVHVAVEMSLKLQSVEYVVFDEADRLFE 
MGFAKQLQE I IAR I.PGGHQTVLFSATLPKLLVEFARAGLTEPVL 
IRLDVDTKI^EQLKTSFPLVREDTKAAVIiLHLLHNVVRPQDQTV 
VFVATKHHAEYLTELLTTQRVSCAHIYSALDPTARKINLAKFTL 
GKCSTLI VTDLAARGLDI PUjDNVINYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGRVPQSVVDEEDSGLQSTLEASLELRGLARVADNAQQQ 
YVRSRPAPSPESIKRAKEMDLVGLGLHPLFSSRFEEEELQRLRL 
VDSIKNYRSRATIFEINASSRDLCSQVMRAKRQKDRKAIARFQQ 
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVEDIFS 
EVVGRKRQRSGPNRGAKRRREEARQRDQEFYIPYRPKDFDSERG 








LSISGEGGAFEQQAAGAVLDLMGDEAQNLTRGRCX2LKWDRKKKR 
rVGQSGQEDKKKIKTESGRYISSSYKRDLYQKWKQKQKID*S*L 

LELKTKQQILKQRRRAQKAALSLQRWWPQAALCPQ 


6714 


169 


1416 


NNCQELLPPPPAPMAHIPSGGAPAAGAAPMGPQYCVCKVELSVS 
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF 
cjjvju- vLtUYHt EEVQKIjKFALFDgDKSSMRLDEHDFTjGQFSCSLG 
TIVSSKKITRPLLLLNDKPAGKGLITIAAQELSDNRVITLSLAG 
RRLD KKDLFG KSDP FLE FY KPG DDGKWMLVHRTEVI KYTLDPVW 

KPFTVPLVSLCDGDMEKPIQVKCYDYDNDGGHDFIGEFQTSVSQ 
MCE ARDS V P LE FEC INP KKQRKKKN YKNSG III LRS C KI NRD YS 
PLDYILGGCQLMFTVGIDFTASNGNPLDPSSLHYINPMGTNEYL 
S AI WAVGQI IQDYDSDKMFPALG FG AQLP P DW KVS HE FA I NFNP 
TNPFCSGVDGIAQAYSACLP 


6715 


32 


493 


f AijAC,i><ji) bHCLPAi. v UALiAGAAHS PHGGQ P PRRG PL I GSGMP 

GKP KH LG VPNGRM VLAVS DG ELS S TTG PQGQGEG RGS S LS I US L 

PSGPSSPFPTEEQPVASWALSFERLLQDPLGLAYFTEFLKKEFS 
*usw v i r WKACERcQQIPASDT 


6716 


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ 
HTVTLHRVSLCCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTS YS I DDQSQQSYDYGGSGGP YS KU YAG 
YDYSQQGRFVP PDMMQPQQP YTGQI YQ PTQAYTPAS PQPFYGNN 
FEDEPPLLEELGINFDHIWQKTLTVLHPLKVADGSIMNETDLAG 
PMVFCLAFGATLLLAGKIQFG YVYGI SAIGCLGMFCLLNLMSMT 
G VS FGCVAS VLG YCLL P M I LLSS FAV I FSLQGM VG 1 1 LTAG I IG 
WCSFSASKIF I S ALAMEGQQLLVA Y P CALLYGVFALI S VP 


6718 


290 


599 


KQSSTVPGTILPSLKWHJNSGLCKFPETGGKMTTFKEGLTFKDVA 
VIFTEEELGLLDPVQRNLYQDVMLBNFRNLLSVGHHPFKHDVFL 
LEKEKKLDIMKTATQ 


6719 


1 


691 


FJ'R P E 2QDREDG KCHKMEMN P ISGNLNCD P I AM SQCS S DHG CET 
v±ju&vvvkx fcAPNNFMKDS AS QDNGL SRKI SRKRVCSS DSDSSL 
QWKKSSKARTGLLRITRRCAATAANKIKLMSDVEDVSLENVHT 
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDSEGSTKVLSQALNGDSDSEDMLNSEHKHRHTNIHKIDAPSK 
RKSSSVTSSG 


6720 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
VP I TE KSNPLTQDLDKADAEN I VRLLGQCDAE I FQE EGQ ALS T Y 

ORLYSESIIjTTMVrJVAn WOT?\7T.lrt7Dr»r , r , T inrr nnnnmn^Min yi 

^ •» i-io x u a xi i v^v*\oi\.vyci viji^yiJU^.LiVVL»SGGGTSGRMAF 
LMS VS FNQLM KGLGQKPL YTYL I AGGDRSWAS REGTEDS ALHG 
I EEL KKVAAG KKRV I VIG I S VGLS AP F VAG QMDCCMNNTAVFL P 
VLVGFNPVSMARHPFPPPRTT.PQr.TVPPCT d&duvattct t eio«* 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTTCPGTKRFQHVIETPEPGKWELTGYEAA 
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY 
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF 
LMSVSFNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG 
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP 
VLVG FN PVSMARHP FPPPR I LRS LTVFPS LRAPHYQ I TSLLF5M 
SWTLISE 


6722 


1 


390 


RSWSKRTWQALPiyiAVLFLLLFLCGTPQAAnNMQAIYVALGEAVE 
LPCPSPSTLHGDEHLSWFCSPAAGSFTTLVAQVQVGRPAPDPGK 
PGRESRLRLLGNYS LWLEGS KEEDAGRYW CAVLGQHHNYQNW 


6723 


173 


659 


VCQYCTARMADFG I SAGQFVAWWDKSSP VEALKGLVDKLQALT 
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL 
AE I AR I LRPGGCLFLKEPVETAVDNNSKVKTASKLCS ALTLSGL 


6724 
6725 


173 


659 


VEVKELQREPLTPEEVQSVREHLGHESDNL 

VCQ Y CTARMAD FG I SAGO FVAWWDKS S P VRAT va vnifmaT — 

GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL 
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL 
VEVKELQRE PLTPEEVQS VREHLGHESDNL 


6726 


356 


722 


KKRTPPVIl^TMDDDLMLALRLQEEWNLQEAERDHAQESLSLVD 
ASWELVDPTPDLOJUjFVQFNDQPFWGQLHAVEVKWSVRMTLCAG 
ICSYEGKGGMCSIRLSEPLLKXRPRKDLVEVFFV 


6727 


98 


714 


^ * ca i\ Cr i\ d i cAjivnjNo LitL/l UQGKNCKSTIjMTIjNVG 

gylyitqkqtltkypdtflegivngkilcpfdadghyfidrdgl 
lfrhvlnflrngelllpegfrenqllaqeaeffqlkglaeevks 
rwekeqltprettfleitdnhdrsqglrifcnapdfi'skiksri 

VL VS KS R LDG F PRE P <? T <! CMT T n v v vu t v 


6728 


1 


831 


r KGMGDER PH Y YGKHGT PQKY D PT F KG P I YNRGCTD ll CC V FLL 

LA I VGYVAVG I IAWTHGDPRKVI YPTDSRGE FCGQKGTKNENKP 

YLFYFNIVKCASPLVLLBFQCPTPQICVEKCPDRYLTYLNARSS 

RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVLIPSKPLARRCF 

PAIHAYKGVLMVGNETTYEDGHGSRKNITDLVEGAKKANGVLEA 

RQLAMRIFEDYTVSWYMDIISLG^IAMSLLFIILLRFLAGIMG 
RGMI IMGILVLGY 


6729 


486 


935 


'lo-iiv^jxjrtiyoouovTiu'Jr ijvljijlC»C»IASGKSS vlQVFQQLGCA 
VIDVDVMARHWQPGYPAHRRIVEVFGTEVLLENGDINRKVLGD 

LIFNQPDRROLliNATTHP'PTOVPMMIfCTCWm VtVnrtrr.rynn^,,... 

HVPSALKEADSLMRRDT 


6730 


259 


1191 


VGLTQAQSGKTASMGRDQRAVAGPALRRWLIjLGTVTVGFLAQSV 
LAG VKKFDVPCGG RDCSGGCQCY PEKGG RGQ PG P VG PQG YNG P P 
GLQGFPGLOGRKGDKGERGAPnv/TRD icrT\\m'AT>n\Tcnr?n^n< r**-.* 
PGH PGQGGP RGR PG YDGCNGTQGDSG PQGP PGS EGFTGP PG PQG 
PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 
GP VG APGRPGP PGPPG P KG QQGNRGLG FYGVKG E KGDVGQ PGPN 

GIPSDTLHPHAPTGVTFHPDQYKGEKGSEGEPGIRGISLKGEE 

GIM • j 


6731 


784 


1015 


NMVDYYEVLGbQRYASPEDIKKAYHKVALKWHPDKNPENKEEAE 
RKFKEVAEAYBVLSKTDEKRDIYDKYGTEGLNEF 


6732 


1 


446 


U XRKR LHGA WPR VE VdC P WETRES EG VHLER PTS PL KNNDEGS 
LDIYAGLDSAVSDSASKSCVPSRNCLDT.VT?PTT TTzrnvAvv*rmr 

NDIiQVEYGKCQLQMKELMKKFKEIQTQNFSLINENQSLKKNISA 
LIKTARVE I NRKDE E I 


6733 


102 


1205 


urwqrrppppspplwclqpgggsdpqqltqlrhclshspWpw - 

AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQBN 
LGSIKPSSRSTKATSTTMAGDGRRAEAVREGWGVYVTPRAPIRE 
GRGRLAPQNGGSSDAPAYRTP PSROGRREVR F<?np D Dc\/vrtni?c« 

PLVAKERSPVGKRTRLEEFRSDSAKEEVRESAYYLRSRQRRQPR 

PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 

DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 

YBATSVQQKVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG 
DKTTRSSSQYIESFW 


6734 


613 


1311 


RS CRQ VGMRS RNQGG ES ASDGHI S C PKPS 1 1 GNAGEKSLS E DAK 

KKKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLELSKEDLI 

QLLSIMEGELQAREDVIHMLKTEKTKPEVLEAHYGSAEPEKVLR 

VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLEQL 

LU^KCHRRrVYELENEKHKHTDYMNKSDDFTNLLEQERERLKK 
LLEQEKAYQARKE 




189 


551 


SAAWiTPVrSGCFgELQEKNKStELVSPBKVAVHFrWBBWQPLDD 
^QRTLYRDVMLETYSSLVSLGHCITKPEMIFKLEQGAEPWIVEE 


6735 


280 


558 


TLNLRLSGGSKKQVFSGICHRSLVELQEVHLV 
KSRJU^VTWISNPFLKQVFNKDKTFRPKRKFEPGTQRFBLHKKA 

QASLNAGLDLRLAVQLPPGEDLNDWVAVHWDFFNRVNLIYGTI 
XDGCT 


6736 


195 


808 


MNYELNFKKKMPNIKSLGLTNLNFUiKRi^SVLPLITDYVyFEW 
SSSNP YLIRRI EELNKTASGNVEAKWCFYRPwnT cmtt tmt 

KHAKEIEBESETTVEADLTDKQKHQLKHRELFLSRQYESLPATH 
IRGKCSVALLNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 
EIRVGPRYQADIPEMLLEGTFFCVFAVL 


6737 


150 


1209 


PVIMPIiHFSPGDIVRPSPPVQQQPIfr PPWiuear DPvn^r.m^T n — 
* JfWA v ^ r ^^^voo3^j\jjKKWAH5KLjESYRPDTDLS 

REDTGCNLQHISDRENIDDLNMEFNPSDHPRASTIFLSKSQTDV 

R EKR KS LP 1NHH PPGQ I AR KYS S CS TI FLDDSTVS Q PN LK YT I K 

CVALAI YYHI KNRDPDGRMLLDI FDENLHPLSKSE VPPDYDKHN 

PEQKQIYRFVRTLFSAAQLTAECAIVTLVYLERLLTYAEIDICP 
ANWKRIVIXSAILLASKVWnnnavwrjvnvr'oTT imTmrrnuKTriT n 

RQFLELLQFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
RAHKLEAI SR LCEDKYKDLRRSARKRSAS ADNLTLPRWS PAIIS 


6738 


148 


653 


CACAEQPARAEVGAATALPVRWASGEMAPSGSIiAVPLAVLVLLL 
WG AP WTHGRRSNVR VI TDENWRELIiEGDSiM I E FYAP W C PACQNL 
m-mrmztiwcuLtct vw x*\ts. vu v i JiyPQjjjSGRFI ITALPTI YHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWF 


6739 


3 


631 

631 


SWPDMAEEEVAKLEKHLMLJbRQEYVKI^KKLABTEKRCALLAAQ 
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VIAARSDSWSXANLSSTKELDLSDAKrPEVTMTMLRWIYTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ 
TAB E LNASTLMN YCAE 1 1 ASHWVS EVEGVNKAL 


6740 


3 




SWPDMAEEEVAVT.PKT4T.MT.T.Pni?VT/WT Av t/T iM?>nr>rmrnLt „ ., , 

w»ir*ytiri£,iiiiv/\rLU£.ivriiji»iJjLiKUC»l V KJjUKKliAbTEKRCALLAAQ 
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VLAARSDSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEF 
REDDVFLTEIjMKIANRFQIjQLLRERCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAEIIASHWVSEVEGVNKAL 


6741 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 

HSG I CTRTVQHQDS Q VNALE VTPDRS M I AAAVQ P VS LGYQH I RM 

YDLNSNNPNPI ISYDGVNKNT &<!Vr:PH7iv:DUMVTr<AT?nn>Ti> 

* A>3iui3vnjui AnO vkjC ttlllAjKWWi -TGGEDCTARI 

WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW 
DLKTDH N EQLI PEP E VS I TS AH I DP DAS YMAAVNS TLVP FS CLL 

PLAIGILQEGEFESLARRGLLFLACQGNCYVWNLTGGIGDEVTO 
LIPKTKIP 


6742 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM 
YDLNSNNPNPI I S YDGVNKNIAS VGFHEDGRWMYTGGEDCTARI 
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW 
DLKTDHNEOL I P EP E VS I TS AH I DP DAS YMAAVNSTL VP FS CLL 
PLA IG I LQEGE FES LARRGLL FLACQGNC YVWNLTGG I GDE VTQ 
LIPKTKIP 


6743 


1 


412 


M HS TQDKS LHLEGD PNPSAAP TS TCAPR KM P KR I S I S KQLAS VK 
ALRKCS DLE KAI ATTALI PRNS SDS DG KLE KA I AKDLLOTQ FRN 
FAEGQETKPKYRE I LS ELDEHTENKLD FE D FM I LLLS I T VMSDL 
LQNIR 


6744 


95 


1343 


RTPARNRCAG CBVLS RFSS PNKAS S FALQS AGGGL PA VRALR RD 

rqkvstvgygmdeveqdqhbarlkelfdsfdttgtgslgqeelt 

DLCHMLSLEEVAPVLQQTLLQDNLLGRVHFDQFKEALILILSRT 

lsneehfqepdcsleaqpkyvrggkrygrrslpefqesveefpe 
vtviepldeearpshipagdcsehwktqrseeyeaegqlrfwnp 
ddlnasqsgssppqdwieeklqevcedlgitrdghlnrkklvsi 








\-o w i v* uyn v tAj tsr-jjjii a v r UN JjD PDGTMS VEDFFYGIjFKNGKSLT 
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
lJiAjriOrtAo v C/KI J-iDTWyErJG I ENSQEl LKALDI^tLDGNINLTEL 
TLALENELLVTKNS I HQAC I 


6745 


1 


588 


TFRDQG W AQRRRWLLG CAS WES W EAAIAAG PGLP S S T ARQQNN P 
AAGTEC FAAVWARGTAMG S VLS TDSG KS AP AS ATARALERRRD P 
nijfv lb rOCAVCLEVLHQPVRTRCGHVFCRSClATSLKNNKWTC 
PYCRAYLPSEGVPATDVAKRMKSEYKNCAECDTLVCLSEMRAHI 
RTCQKYIDKYGPLQELEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFQTTEI 
S LWTWAAIQAVEKKMESQAARLQSLEGRTGTAEKKliADCEKMA 
VEFGNQLEGKWA VLGTLLQE YG LLQR RLENVENLLRNRN 


6747 


247 


484 


EAVTFKDVAWFTEEELGLLDLAQRKLYRDVMLENFRNLLSVGH 
Q P FH RDT FHFLRE E KF WMMD I ATQREGNS V Y AGVC 


6748 


201 


665 


mttfkeavtfkdvawfteeelglldpaqrklyrdvHLenfrnl " 
ls vgnq p fhqdtfh flgke kfwkmktts qregnsgg ki q i emet 

VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFFKEGDVPC 
QI EARLS I SXVQQXP YRCNBCKQ 


6749 


95 


719 


RREVKGGDGVCPRARGS PQSQQF PSCAGGGEGLQQSGEALDGAM 
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLBKEFDKAF 
VDVDIiLLGEIDPDQADITYEGRQKMl'SLSSCFAQLCHKAQSVSQ 
INHKLEAQLVDLKS KLTETQAEKWLBKEVHDQLLQLHS IQLQL 
HAKTG QS AD S GT I KAKLSGP S VEE LE RELKAN 


6750 


3 


428 


S CE S RR PG AKWVWASG AL PRDTTGLGSEQ PSG D VAQ SNRATMGT 
TAPGPIHLLELCDQKLMEFLCNMDNKDLVWLEEIQEEAERMFTR 
EFSKEPELMPKTPSQKNRRKKRRISYVQDENRDPIRRRLSRRKS 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCliQMSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEGYNVCIFAYGQTGAGKSYTMMGKQEKDQQGIIPQLCEDL 
FSR I NDTTNDNMS YS VE VS YME I Y CE R VRDLLNP KNKGNLRVRE 
HPLLGPYVEDLS KLAVTS YND I QDLMDSGNKARTVAATNMNE TS 
SRSHAVFNIIFTQKRHDAETNITTEKVSKISLVDLAGSERADST 
GAKGTRLKEGANINKSLTTLGKVISALAEKDSGPNKNKKKKKTD 
F I P YRDS VLTW LLR ENLGGNS RTAMVAALS PAD I N YDETLS TLR 
YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRDLLYAQGLGD 
ITDMTNALVGMSPSSSLSALSSRNV 


6752 


24 


1834 


RNCVP PLGC YRSRVKFHSD I KMQ YSHHCEHLLERLNKQREAGFL * 

CDCTIVIGEFQFKAHRNVLASFSEYFGAIYRSTSENNVFLDQSQ 

VKADGFQKLLEFI YTGTLNLDSWNVKEIHQAADYLKVEEVVTKC 

KI KMEDFAFIANPSSTEI SS ITGNI ELNQQTCLLTLRDYNWREK 

SEVSTDLIQANPKQGALAKKSSQTKKKKKAFNSPKTGQNKTVQY 

PSDILENASVELFLDANKLPTPWEOVAQINDNSELELTSWEN 

TFPAQDIVHTVTVKRKRGKSQPNCALKEHSMSNIASVKSPYEAE 

NSGEELDQRYSKAKPMCNTCGKVFSEASSliRRHMRIHKGVKPYV 

CHLCG KAFTQCNQIiKTHVRTHTG EKP YKCELCDKG FAQKCQL VF 

HSRMHHGEEKPYKCDVCNLQFATSSNLKIHARKHSGEKPYVCDR 

CGQRFAQASTLTYHVRRHTGEKPYVCDTCGKAFAVSSSLITHSR 

KHTGEKPFICEIiCGNSYTDIKNLKKHKTKVHSGADKTLDSSAED 

HTLSEQDSIQKSPLSETMDVKPSDMTLPLALPLGTEDHHMLLPV 

TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY 


6753 


2 


1305 


VPSLPYPPQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS 
PFGIKLRRTNYSLRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
SARGEKEMEGVALKHGPSLPQERKQAPSTRRDSAEPSSSRSVPV 
AHPGPPPASSQTPAPEHDKAAWKMPLAQKPALAPKPTSQTPPAS 








PLSKLSRPYLVELLSRRAGRPDPEPSEPSKEDQESSDRRPPSPP 

GPBERKGQKRDEEEEATERKPASPPIiPATQQEKPSQTPEAGRKE 

KPMLQSRHSLDGSKLTEKVETAQPLWITLALQKQKGFREQQATR 

EERKQAREAKQAEKLSKENVSVSVQPGSSSVSRAGSLHKSTALP 

EEKRPETAVSRLERREQLFCKANTLPTSVTVEISYSSPAAPLVKE 

VSKRFSSPDDAPVSSEPAWLALAKRKAKAWSDCPLIIK 


6754 
6755 


2 


413 


FVRRRRRRLGG PEVNTMQ^T.HK^PTIinFnnuT lectio t t r-T^y r^T-. — ' 
«»»wi\«vvrijvH i i »oounjvoKXMuruuvijKEPSlAIjEKJC«RE 

LS PSGI PCEGGLRCLCWKILLNYLPLERASWTS ILAKQRELYAQ 

FLREMIIQPGIAKANMGVSREDVTFEDHPLNPNPDSRWNTYFKD 


6756 


298 


1341 


PGLOLOVALEADWFLDMPGGRRGPSRQQIjSRSALPSLQTLVGGG 

CGNGTGLRNRNGSATf3T.PVDD7TaT TTDrmffllinnrnn. 

>.u«un3ui«vRHooHHjijfvFi'i lAliXl PGPVRHCQIPDLPVDGS 
LLFE FLF F I YLL VALFI Q Y I Nl YKTVWW YPYNHP AS CTS LNFHL 
IDYHLAAFITVMLARRLVWALISEATKAGAASMIHYMVLISARL 
VLLTLCX5WVLCWTLVNLFRSHSVLNLLFLGYPFGVYVPLCCFHQ 

DSRAHIiljLTDYNYW/fVWTriiVTrPCTi cm/nrr hvrtmnr «**<.. 

wjwiiiuuuiujH i- v vyn£,AviiabAij -I vtjvjLAKSKDFLSLTjLESL 

KEQFNNATPIPTHSCPLS PDLIRNEVECLKADFNHRIKEVLFNS 
LPS AY YVAFL PLCFVK VS G Y LTFMCFLDLCVNY INWVFLV 


6757 


180 


754 


IKRALGSLPl^lPVSWGSLRTLKYQQQPEiRPKVLLCQTRVQOiD 
LRSLQPQPPGLKQS FCLRVLGLQTGATTPGLRDLTCKELI I I/TE 
REAQKRKKRKEKESGMALTQGPLTFRDVAIEFSQEEWKSLDPVQ 
KAli YWD VMLEN YRNLVFLG KDNFAJuE VKI CPRVF LY FLCCLS WE 
PFHYLTETEALLTHK 


6758 


2 


459 


noiwij/it »^^^oUv^^U/VrlKKi^bSwWWLATvCMLLFSHLSAVQ 
TRGIKHRIKWNRKALPSTAQITEAQVAENRPGAFIKQGRKLDID 
FGAEGNRYYFJiNYWQFPDGIHYNGCSEANVTKEAFVTGCINATQ 
AANQGS FQ KPDNXLHQQVL W 




1 


1008 


WUI vjiA.ii.i- ^^^^^rrfijir«Jt4jijKL> ViiAV WV5L»SAIjGPGSFCRR 
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWpppppps 
L P PS FRRNMANNS P ALTGNS Q PQHQAAAAAAQQQQQ CGGGGATK 
PAVSGKQGNVLPLWGNEKTMNT^NTPMTT.TNTT OQnvctrtr/"\T vnr yj. 

TYHEWDEIYFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGGI 
VSTAFCLLYKLFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT 
QPPTDLWDWFESFLDDEEDLDVKAGGGCVMTIGEMLRSFLTKLE 
WFSTLFPRIPVPVQKNIDQQIKTRPRKI 


6759 


1 


513 


RIWNFHSLDGTSTRAFHPQTGLPLLSSPVPQRKTQSGCFbLDSS 
LLHLKSFSSRSPRPCIiNIEDDPDIHEKPFLSSSAPPITSLSLLG 
NFEESVLNYRFDPI*G I VDG FTAEVGASGAFCPTHLTLPVEVS FY 
SVSDDNAPSPYMGVITLESLGKRGYRVPPSGTIQWCVL 


6760 
6761 


239 


606 


VLSKKKGLSAEEKRTRMMEIFSETKDVFQLKDLEKIAPKEKGIT 
AMSVKEVLQSLVDDGMVDCER1GTSNYYWAFPSKALHARKHKLE 
VLESQLSEGSQKHA3LQKSIEKAKIGRCETEERT 




29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS 
S S S S VQRCELS L FQS LHTMTS K K L VNS VAGCADDALAGL VACN P 
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
MLTGVIAGAVFTSPAVGS ILAAIRAVAQAGTVGTLLI VKNYTGD 
RLN FG LAREQARAEG I P VEMWIGDDS AFTVLKKAGRRG LCGT V 
LIHKVAGALAEAGVGLEE IAKQVNWTKAKGTLGVSLSSCSVPG 
S K PTFELS ADEVELGLG I HGEAG VR R I KMATAD E I VKLMLDHMT 
NTTNASHVPVQPGSSVVMMVNNIiGGLSFLELGIIADATVRSLEG 
RGVKIARALVGTFMSALEMPGISLTL.LLVDEPLLKLIDAETTAA 
AWPNVAAVSITGRKRSRVAPAEPQEAPDSTAAGGSASKRMALVL 
BRVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
PPPASPAQLLSKLSVLLLEKMGGSSGALYGLFLTAAAQPLKAKT 
SLPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQEL 


6762 


3 


613 


ASTISWRLCVAGABARRPVPVAGBRAGGGAMWFMYLLSMLSLFI 
QVAF I TUVVAAGLYYLAELIEB YTVATSRI I KYMTWF q T2^rr ti~ 

LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 
LWVNHYLAFQFFABEYYPFSBVLAYFTFCLWIIPFAFFVSLSA 
G HNVL PSTMQPGDD WS N Y FTKG KRG K 


6763 


2 


760 


5GPDFPGRR FRGCCC VRPPAGAGME LGGHWDMNSAPRLVS ETAE 
RKQEQKTGTEAEAADSGAVGARRPIiT.rT.VTi^npT nr crueMinm 
LhS LH VKS LGAS PTVAG I VGS S YG I LQLFS STLVGCWS D WGRR 
SS LLACI LLSALGYLLI^SAATNVFL FVLAR VPAG I FKHTLS I SK 
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYI.TELEDGF 
YLTAF ICFLVFI LNAGLVW FFPRREAKPGSTE 


6764 


80 


438 


liKKMDTMMLSVRNLFEQLVRRVEILSEGNEVQFZQLAKDFEDFR 
KKWQRTDHEIiG KYKDLLMKAETERSALDVKLKHARNQVDVEI KR 
RQR AE ADCE KLE RQ I QL I REMLMCDTSGS I Q 


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLLQFPSGPSRHFLAACVARWL 
RGSVLVSEALSGSAKDGIVTEVAVGVKRGSDEIiLSGSVLSSPNS 
mi ioouv v i./u>(kJi»uoi\.i\r MjbUKMUGAPSRVLHIRKIiPGEVTETE ! 
V I ALG LPFGKVTN I LML KG KNQAFLELATE EAAI TNGNY YS AVT 
PHLRNQ * 




1 


1287 


EGGSFKASLTWLWPLGEMKLHCEVEVISRHLPALGLRNRGKGVR 
AVLSliCQQTSRSQPPVRAFLLISTLKDKRGTRYELRBNIEQFFT 
^ vt/ooivrtA vKu.Mii'i j vuiui^iUU^oSSIiKGFLSAMRIiAHRGCN i 
VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSKNFPYSLEHL 
QTSYCGLVRVDMRMLCLKSLRKLDLSHNHIKKLPATIGDLIHLQ 
E LN LN DNHLE S FS VALCHSTLQKS L WS LDLS KNK I KALP VQ FCQ 
LQELKNLKLDDNELIQFPCKIGQLINLRFLSAARNKLPFLPSEF 
RNLSLEYLDLFGNTFEQPKVLPVIKLQAPLTLLESSARTILHNR 
IPYGSHlIPFHLCQDLDTAKICVCGRFCIiNSFIQGTTTMNLHSV 
AHTWLVDNLGGTEAP 1 1 S YFCSLGCYVNSSDI 


6767 


336 


919 


APM I CLCS SDLQ FR YKEAFLRDRGLQ IG YCS VDDD PRMKH FLNV 
GRLQSDNE YKKD FAKS RSQFHSSTDQ PGLLQ AKRS QQ LAS DVH Y 
RQPLPQPTCDPEQLGLRHAQXAHQLQSDVKYKSDLNLTRGVGWT 
PPGSYKVEMARRAAELANARGLGLQGAYRGAEAVEAGDHQSGEV 
NPDATEILHVKKKKALLL 


6768 
6769 


2 
284 


363 
396 


PGSTISCYLLSEGSLPLCMQVACGBEKHRAP-mKTLRARFKkTE"* 
LRLSPTDLGSCPPCGPCPIPKPAARGRRQSQDWGKSDERLLQAV 
ENNDA P R VAALI AR KGLVP T KLDPEG KS AFHL 


6770 


1 


397 


MSTPDFSTAENNQEUANEVSCLKAMLTLMLQAMGQAD 
URNYQVIWSSTMAKLHDYYKDEWKECLMtfeFNy^SVMQVPRVEK 
ITLNMGVGEAIADKKLLDNAAADLAAISGQKPLITKARKSVAGF 
KIRQGYPIGCXVTLRGERMWEFFERLITrAVPRIRDFRGLSAKS 


6771 


3 


378 


APAGTLAMTGKSVKDVDRYQAVLANLLLEEDNKFCADCQSKGPR 
WASWNIGVFICIRCAGIHRNLGVHISRVKSVNIiDQMTQEQlQCM 
QEMGNGKANRLYEAYLPETFRRPQIDPYLFWSNLEG 


6772 


1 


1400 


AAAFLQGMTVJNGFINTV1TSL\ERRYDLHSYQSGLIASSYD1AA 

CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 

P**GWKLDAGVRTCPANPR\PVCAG\HTSGLSRYQLVFMLGQFL 

HGVGATPLYTLGVTYLDENVKSSCSPIYIAIFYTAAILGPAAGY 

LIGGALLNIYTEMGRRTELTTESPLWVGAWWVGFLGSGAAAFFT 

AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGKT 

IRDLPLSIWLLLKNPTFILLCIiAGATEATLITGMSTFSPKFIiES 

QFSLSASEAATLFGYLWPAGGGGTFLGGFFVNKLRLRGSAVIK 

FCLFCTWS LLG I L VFSLHCP S VPMAG VTAS YGG S LL PEGHLNL 

TAPCNAACSCQPEHYSPVCGSDGLMYFSLCHAGCPAATETNVDG 

QKVYRDCS C I PQNLSSGPGHATAGKCTST 


6773 


1 


630 


PWEAPKEHKYKAEEHTWLTVTGEPCHFPFQrHRQLYHKCTHKG 
RPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRT 
EQAAVARCQCKG PDAHCQRLASQACRTN PCLHGGRCLE VEGHRL 
CHCPVGYTG?FCDVGE*GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTELSDQQY?LFFILSS/ l WVPTFIjSMDVDGRVIKADSFSKilSS " 
GLRIGFLTGPKPLIERVILK1QVSTLHPSTFNQLMISQ 


6775 


104 


614 


TCPSQLR VLTARGGRRAPSPQLWTLVLALI EE KWRSHR I LRMNS 
GR P ETMENL PALYT I FQGEVAMVTDYG A F I KI PGCRKQGLVHRT 
HMS S CR VD KPSE I VDVGDKVWVXL IGR EM KNDR I K VS LS MKWN 
QGTG KDLD PNNV \ S LS KKRGGG D PSR I TLG RRS PLRLS 


6776 


3 


1108 


HERHERHEGALSQDALLRlSIPLUSNMRPEKCRRFVHPQWQLIilT" 
LNGTFPKTSDADMEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
TSVAKFVFMAGMMVGGILGGHLSDRFGRRFVLRWCYLQVAIVGT 
CAALAP-lTLIYCSLRFIiSGIAAMSLITNTIMLIAEWATHRFQ^I 
G I TLGMCPSG IAFMTLAGLAFAI RDWH I LQL WS VPY FVI FLTS 

SWLLESARWLIINNKPEEGLKELRKAAHRSGMKNARDTLTLEIL 

KSTMKKELEAAQKKKPFLGERLHMPNICKRISLLPFTKFANFKA 

YFGLNbHG/LKHLGNNVFLLQTLFGAV/TPPGQLVLHLGHWGSG 
RVSSRGRVNCLGLFVLQVW 


6777 


779 


63 


CFFHGPAWRDCEVRATFAKKQGQSGIISCIAFSPAQPLYACGSY 
GRS LGLY AWDDGS PLAliLGGHQGG I THLCFHP DGNR FFSG AR KD 
AELLCWDLRQSGYPLWSLGREVTTNQRiyFDLDPTGQFLVSGST 
SG AVS VWDTDG PGNDG KPE P VLS FL PQKDCTNG VSLHPS LPLLG 

HCLPVSVCFLSPTESGGRRRGAGPSLGSPRRHVHLECRLQLWWC 
GGGARLQHP ♦ * S PRARKGR 


6778 


311 


805 


IQSITDESRGSIRRKNPANTRLRIiNVP\EBTAGDSE/ERSPEEE 
VQADPRIRSASPKCPTSSPFPKGRSPEGEGET\DPEKVHFHPGP 
KDKSVAEKN\KGP\SPVSSEGIKDFFSMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6779 


2 


535 


RALRRQ PRLLAANG I E PES MAI SEPIKGSRKP CVNKE E LALKKP 
MAKCAWKGPREPPQDARAEAES PGGASESDQDGGHESPPKKKAV 
AW VS AKNPAPMRKKKKVS LGP VS YVL VD S EDGR K KPVM P KKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 


6780 
6781 


3 


403 


HEVNDNKPE1N1NLKSPGKEEISYI7EGDPIDTFVALVR"VqdKD 
SGLNGEIVCKLHGHGHFKLQKTYENNYLILTNATLDREKRSEYS 

ltviaedrgtpslstvkhftvqindindnpphfqrsryefvise 




1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPEbSEVS 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT 
SAALPTHLQSALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 

TPVFINSSSIIQVMKPSQPSTIPAAPLTTNSGLMPPSVAWGPL 
HIPQNIKFSSAPVPPNALSSSPAPNTOTRP'DT.VT OCDATnvnr ™ 

SPPCTSSPWPSHPPVQQVKELNPDEASPQVNTSADQNTLPSSQ 
STTMVSPLLTNSPGSSGNRRSPVSSSKGKGKVDKIGQILLTKAC 
KKVTGSLEKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 
KTPTPPAPTLLKMTSS P VG PGTAS AG PSLPGGALPTS VRS I VTT 
LVPSELISAVPTTKSNHGGIASESLAG 


6782 


3 


1327 


KKPTVIRIPAXPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVE RTRNLESNHPGQTGG FVRVPPR LP PRP VNG KT I PTQQ P PTK 

VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSKYMLSVPHGIANEDIVSQ 
NPGELSCKRGDVLVMLKQTENNYLECQKGEDTGRVHLSQMKLIT 
PltDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEOV 
DDLNLTSGEIVYLLBKIDTDWYRGNCRNQIGIFPANyVKVIIDI 
PEGGNG KREC VS S HCV KGS RCVAR PEY I G EQ KDE LS FSEGE 1 1 1 
LKEYVNEEWARGE VRGRTG I FPLNFVE PVEDYPTSGANVLSTKV 

PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 
I 


6783 


3 


1750 


SYHHHHAQQSAAASPNLTASQKTVTTTSMITTKTbPLVIjKAATA 
TM PAS VVGQRPT I AMVTA INSQKAVLS TDVQNT P VNLQTS S KVT 
GPGAEAVQI VAKNTVTLQVQATPPQPI KVPQFI PPPRLTPRPNF 
LPOVRPKPVAQNNIPIAPAPPPMLAAP0LIQRPVMLTKFTPTTL 
PTSQNS I H P VRWNGQTAT I AKTFPMAQLTS I VIATPGTRLAGP 
QTVQLSKPSLEKQWKSKTETDEKQTESRTITPPAAPKPKREEN 
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE 
PE RKKS AVTYLNS TMHPGTR KRGR P P KYNAVLG FG ALTP TS PQS 
SHPDS PENEKTETTFTFPAPVQPVS LPS PTSTDGDI HED FCS VC 
RKSGQIiLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK 
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNS1SKCMEMKNTILARQKEMHSSLEKVKQLIRLIH 
G IDLS KP VDSEATVGAISNG PDCTP PANAATSTPAPS PSSQS CT 
ANCNQGEETK 


6784 


3 


1750 


SYHHHHAQQSAAAS PNLTASQKTVTTTSMITTKTLPLVLKAATA 
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLOTSSKVT 
GPGAEAVQI VAKNTVTIjQVQATP PQ P I KVPQF I P P PRLT PR PN F 
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL 
PTSQNS IHPVRWNGQTATI AKTFPMAQLTS I VIATPGTRLAGP 
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPXPKREEN 
PQKLAFMVSLGLVTHDHLEEIQSKRQERXRRTTANPVYSGAVPE 
PERKKSAVTYLNSTMHPGTRKRGRP PKYNAVLGFGALTPTS PQS 
SHPDS PENEKTETTFTFPAPVQPVS LPS PTSTDGD I HEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK 
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQERBQ 
LEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH 
G IDLS KP VDSEATVGAI SNG PDCTP PANAATSTPAP S PS SQS CT 
ANCNQGEETK 


6785 


1 


528 


LGNTVLHYCSMYSKPECLKLLLRSKPTVDIVNQAGETALDIAKR 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RS PKVLVLAPTRELANHVSRDFKDI \TRKLTVARFYGGTSYQSQ 

INHIRNGIDILVGTPGRIKDHLQSGRLDLSKLRHWLDEVDQML 

DLGFAEQVEDI IHES YKTDS EDNPQTLLFSATCPQWVYTVA\ KK 

YMKSRYEQVDLDGKMTQKAATTVEHLAIQCHWSQRPAVIQDVLQ 

VYSGSEGRAIIFCETKKNVTEMAMNPHIKQNAQCLHGDIAQSQR 

E I TLKGFREGS FKVL VATNVAARGLD I PEVDLVIQS S PPQDVES 

Y I HRSGRTGRAGRTG I CI CF YQPRERGQLR YVEQKAGITFKRVG 

VPSTMDLVKSKSMDAIRSLASVSYAAVDFFRPSAQRLIEEKGAV 

DALAAALAH I SGASS FEPRSL ITSDKGFVTMTLESLEE IQDVSC 

AWKELNRKLSSNAVSQITRMCLLKGNMGVCFDVPTTESERLQAE 

WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 

RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 

FD * VF YHL VD FLS D FL VDS VYLTGRQ I DHLTGLTGL I DH LTS HS 

SVWN 


6787 


2646 


2270 


PSSFPKNVPLEELEEPPK*KRSGLGSLTPKSQIQNGP*PQTFFF 
FELGSPSGVISAHCNLRLLGSSDSPAPASRVAGIIGTCHHAWLI 
LVFLVEMGFHHVGQAGLKLLTL\VIHPPWPPKVLGLQT 


6788 


16 


936 


GGT VDLR \ DMLAVS VLAA VRGGR/ ATVR RVRESNVLHE KS KG KT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPS VQ INTE EHVD\ ELDQ 


6789 






EVI LWGS * DS *G YPKGK* LLPKEVPS R/ RVLLSGLTPr naTnpV 

FTEDLS K\ YVTTMVCVAVNGKPMLG VI HKP FSE YTAWAMVDGGS 

NVKARSSYNEKTPRIVVSRSHSGMVKQVALQTFGNQTTIIPAGG 

AGYKVIoALLDVPDKSQEKADLYIHVTYIKXWDICAGNAILKALG 

GHMTTLSGEEISYTGSDGIEGGLLASIRMNHQALVRKLPDLEKT 
GHK 


6790 


2 


678 


GNG I NVLK 1 AP ESAI KFMA YEQ I KRI*V W * *PGDS*GF/YERLVA 
GSLAGAIAQSSIYPMEVLKTRMALRKTGQYSGMLDCARRILARE 
GVAAFYKGYVPNMLGIIPYAGIDLAVYETLKNAWLQHYAVNSAD 
PGVFVLLACGTMSSTCGQLASYPLALVRTRMQAQASIEGAPEVT 
M^FKHILRTEGAFGLYRGIJU>NFMKVIPAVSISYVVYENLKI 


6791 


2 


4068 


APPAGRRRMQAAPRAGCGAALLLWIVSSCLCRAWTAPSTSQKCD 
EP LVSGLPHVAFSS SS S I SGS YS PG YAKI NKRGGAGGWS PS DSD 

HYOWLQVDFGNRKQISAIATQGRYSSSDWVTQYRMLYSDTGRNW 
KPYHQDGNIWAFPGNINSDGWRHELQHPIIARYVRIVPLDWNG 
EGRIGLRIEVYGCSYWADVINFDGHWLPYRFRNKKMKTLKDVI 
ALNFKTSESEGVILHGEGQQGDYITLBLKKAKLVLSLNLGSNQL 
GPIYGHTSVMTGSLLDDHHWHSWIERQGRSINLTLDRSMQHFR 
TNGEFDYLDLDYEITFGGIPFSGKPSSSSRKNFKGCMESINYNG 
VN I TDLARRKKLEP SNVGNLS FS C VE P YTVP VF FNATS YLEV PG 
RLNQDL FS VS FQ FRTWNPNGLL VFSH FADNLGNVE I DLTES KVG 
VHINITOTKMSQIDISSGSGLNDGQWHEVRFLAKENFAILTIDG 
D E AS A VRTNS PLQ VKTGE KY FFGG FLNQMNNS SHS VLQPS FQGC 

MQLIQVDDQLVNLYEVAQRKPGSFANVSIDMCAIIDRCVPNHCE 

HGGKCSQTWDSFKCTCDETGYSGATCHNSIYEPSCEAYKHLGQT 

SNYYWIDPDGSGPLGPLKVYCNMTEDKVW7IVSHDLQMQTPWG 

YNPEKYSVTQLVYSASMDQISAITDSAEYCEQYVSYFCKMSRLL 

NTPK5SPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 

YYCNCDADYKQWRKDAGFLSYKDHLPVSQVWGDTDRQGSEAKL 

SVGPLRCOGDRNYWNAASFPNPSSYLHFSTFQGETSADISFYFK 
TLTPWGVFLENMGKEDFIKr.RT.^t; ATrvoB-c irr»T7/^Mr> T>trr»-r-t 

ivjivQiyx- J.x\.uoiji\.3>\i t, vorofc LJVCaNGPVEIVVR 

S PT PLNDDQWHR VTAERNVKQAS LQ VDRLPQQ I R KA P TEGHTRL 
ELYSQLFVGGAGGQQGFLGCIRSLRMNGVTLDLEERAKVTSGFI 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCNKD 
VGA FFEEG MWLR YNFQAPATNARDS S SR VDNAPDQQNS H PDLAQ 
EEIRFSFSTTKAPCIIiLYISSFTTDFTiAVT.VKPTrcT nTtovxrr ^ 
G TREP YNID VDHRNMANGQPHS VN I TRHEKT r FLKLDHYP S VS Y 
HLPSSSDTIiFNSPKSLFIiGKVIETGKIDQEIHKYNTPGFTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESNCGASPLTLSPM 
SSATDPWHLDHLDSASADFPYNPGQGQAIRNGVNRNSAI IGGVI 
A\WIFTPSLCTP\VLP*SR*HVSPHKGTLPIPNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPIDESKKEWPHLRGGYLAMG 




1801 


1193 


TGHEGAKGEKtiUKGDLGPRGERGQHGPKGEKGYPGIPPELTPGW 
SAW* S WLTAAST KVQAILL PQP LE * LG LQ I AFMAS LATH FSNQ 

NSGIIFSSVETNIGNFFDVMTGRFGAPVSGVYFFTFSMMKHEDV 
EEVYVYLMHNGNTVFSMYSYEMKGKSDTSSNHAVLKLAKGDEVW 
LRMGNGALHGDHQRFSTFAGFLLFETK 


6792 


33 


1073


VRHTNWGVDMY^FSLGSESPKGAIGHIVSTEKTILAVERKKVLL 
PPLWNRTFSWGFDDFSCCLGSYGSDKVLMTFENLAAKGRCLCAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRGLRLRQALYGHTQAV 
TCLAASVTFSLLVSGSQDCTCILWDLDHLTHVTRLPAHREGISA 
ITISDVSGTIVSCAGAHLSLWNVNGQPLASITTAWGPEGAITCC 
CLMEGPAWDTSQI IITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
\PGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQNQP\PVTAL 
WSRKHTKLLVGDERGRI FCWSADG* EERGSRGSQTTVPG 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6793 


2340 


805 


ViKWiiWl \iG5IiTQAGTVSlX3LDABGQEVFVPFSAVIiPP1VAPND 
LVFIX3WDISSliNLAEAMRRAKVi»DWGLQEQLWPHMEALRPRPSV 
YIPEFIAANQSARADNLIPGSRAQQLEQIRRDIRDFRSSAGLDK
VI VLWTANTBRFCEVI PGLNDTAENLLRT IELGLE VS PSTLFAV 
ASILEGCAFLNGSPQNTLVPGALELAWQHRVFVGGDDFKSGQTK
VKSVLVDFLIGSGLKTMSIVSYNHLGNNDGENLSAPLQFRSKEV 
SKSNWDDMVQSNPVLYTPGEEPDHCWIKYVPYVGDSKRAUJE 
YTS ELMLGGTNTLVLHNTCEDSLLAAPIMLDLAliIiTELCQRVS F 
CTDMDPEPQTFHPVLSLLSFLFKAPLVPPGSPWNALFRQRSCI 
ENILRACVGLPPQNHMLLEHKMERPGPSLKRVGPVAATYPMLNK
KGPVPAATNGCTGDANGHLQEBP PM PTT* GPGHTVSRLFLPAAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL


6794 


169 


1349 


DDVKRKPEASAH*EKPGPPSRPGVRGGRERAGGRGSHGARSCR\ 
EPAPPAPAPPEDHPDEEMGFTIDIKSFLKPGEKTYTQRCRLFVG 
NLPTDITEEDFKRLFERYGEPSEVFINRDRGFGFIRLESRTLAE 
IAKAELDGTILKSRPLRIRFATHGAALTVKNLSPVVSNELLEQA 
FSQFGPVEKAWWDDRGRATGKGFVEFAAKPPARKALERCGDG 
AFLLTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
RFAQPGTFEFEYASRWKALDEMEKQQREQVDRNIREAKEKLEAE
MEAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQIQLR 
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR 


6795 


1746 


1010 


G PRRQTQVRD1 1 ELDS F* DWAAQETDCAQNSGERL * KGV/LENFS 
TMSKSAVKISLDLLSNPLCEQDQDLLNMVTALDTAMKRMDAFNQ 
EKVNQ I QKT VI E PLFCKFGS VFPSLNMAVKRREQALQDYRRLQAK 
VEKYEEKEKTGPVLAKLHQAREELRPVREDFEAKNRQLLEEMPR 
FYGSRLDYFQPSFESLIRAQWYYSEMHKIFGDLSHQLDQPGHS
DEQRERENEAKLSELRALSIVADD


6796 


48 


683 


GKE IQI PTI KLAWLLFGLE * P VGALGKG WS F+ * S HVALGQLGW 
LTRAVRSSWRWELCVSAQEWSQRSA*SSPSPVGACPSLNPPET 
SVQEGRDCWQR*LPRLFSALVGQPGCWPQGAPPERCV*PGRCKW 
HLQSQVLR* ERRRCCRCLPRFA*GWRRRHQRLGLG IHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC* 


6797 


1620 


211 


TERMTPSQPTRGSSCTRPSSMLWTSTWRCLTCHWAGMRMSWGV 
TLGPMAQGLLSASGTTTEATWTRPTTHLTLIRWWLLTASRVDPP 
ERPPPPPSDDLTLLESSSSYKNb/DAQIPQ/DWSMSPSTSG*RP 
LTS RAS S IMRS RTAI PSAS * S RL TTKHTVGG S PS AWR PR PTSRS 
VSTPVSSSTETTASGSCLTWWSSSPAPCPSSSAPAHSFEASCCK 
TSLWGS CGGSGDGSS ACGSGWNLSMAGTSCS S PAMCS PSRAPS * 
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT*PSSTTTISSS 
PHCGWPCPASCASAAAWLSSTWATAS VAGSCWG P I M + SS AHS P W 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCGSSPSSTFTPSS 
ASSSTWCS AS SSRSS PAPTTPS S I PAAQAQRRAS CRPTSHSART 
APP PAS S AAGAARPAAFSAAAEGTPRRS I RCW 


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA
VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELLYSWTM
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSLSSVLA
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
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Amino acid segment containing signal peptide " 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF 
RILFHPAKECPPGRPDVLVWVSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRYK 
LTFALGEQLSTEVGEVDQFPPVEQWGNL


6799 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSLSSVLA
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRYK
LTFALGEQLSTEVGEVDQFPPVEQWGNL


6800 


404 


1646 


RRSPSTGLSPVP0PSSPSLSDYSIPWSLLLSGT1AWATPGK*AG 
* PQAW * LGLAPAI AF I /GLTRGRKQNKEKMAEGGSGDVDDAGDC 
S G AR YND WS DDDDDS NES KS I VW Y P PWAR IGT EAGTRARARARA 
RATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMS EKPY I LE 
AAL I ALGNNAAYAFNRDI IRDLGGLP IVAKI LNTRDP I VKEKAL 
I VLNNLS VNAENQRRLKVYMNQVCDDTITSR LNSSVQLAGLRLb 
TNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAE 
NPAMTRELLRAQVPSSLG\SLFNKKENKEVILKLLVIFENINDN 
FKWEENEPTQN0FGEGSLFFFLKEFQVCADKVIX3IESHHDFLVK 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFESQQASVTMHDVDAESFEVLVDYCYlX3RVSLSEAN\7EFtiT" 
YAASDMLQLEYVREACASFLARRLDLTNCTAILKFADAFGHRKL 
RSQAQS Y 1 AQNFKQLS HMGS I REETLADLTLAQLLAVLRLDSLD 
VESEQTVaiVAVQWLEAAPKERGPSAAEVFKCVRWMHFTEEDQD 
YLEGLLTKPIVKKYCLDVIEGALQMRYGDLLYKSLVPVPNSSSS 
/R*QQQLSCICSRKSTPETGYVCQGDGDLLWTPQRSLS\RYDPY 
S GD I YTM PS PLTS FAHTKTVTS S AVCVS PDHD I YLAAQ PRKDLW 
VYKPAQNS WQQLADRLLCREGMDVAYLNGYI YILGGRDPITGVK 
LKE VEC YS VQRNQWAL VAP VPHS FYS FEL I WQNYL YAVNS KRM 
LCYDPSKNMWLNCASLKRSDFQEACVFNDE I YCICDI P VMKVYN 
PARGEWRR I SN I PLDS ETHNYQ I VNHDQKLLL I TSTTPQW KKNR 
VTVYE YDTRE DQW INI GTMLGLLQFDSG FI CLCAR VYP S CLE PG 
QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 
VAPQRNAQDQQGSL 


6802


157 


1341 


ETFPLFFFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECAE 
PSTRKNLMNS LEQ KIRCLE KQR KE LLEVNQQ WDQQFRS MK EL YE 
RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 
RDRLQREEKEKERLNEELHELKEENKLLKGKNTLANKEKEHYEC 
EIKRLNKALQDALNIKCSFSBDCLRKSRVEFCHEEMRTEMEVLK 
QQVQ I YEEDFKKERSDRERLNQE KE ELQQ I N ETSQS QLNR LNSQ 
IKACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCLAPPPVCCQAG/PR 
TPGLK+ S S CLWLP KC *NFRFILSKES PS VE VHTNRERQQATRER 
G 


6803


1 


2203 


KLSGRPYRHMGVLGTSKLYDIRKTIFTFTPQFIDQQQFYLALDN 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KiOIVEMLRTDLSYLCSRWRMTGQPTITFPISHSMLDEDGTSLNS 
S I LAALR KMQDG YFGGARVQTG KLS E FLTTS CCTHLS FM DPG PE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHPKIAPTSQKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 
NVHMYLPTKLFQASRPSFNLLDSPHPRQENQVPSVRVEIHLPRD 
QSGEVDFKALVLQLKETSSIX3EQADILY^JIiYTMKGPDWNTELYN 
ERS ATVRELLTEL YGKVGE IRHWGLIRY I SG I LRKKVEALDEAC 
TDLLSHQKHLTVGLPPEPREKTISAPLPYEALTQLIDEASEGDM 
S I S I LTQE I M V YLAM YMRTQ PG LFAEMFRLR IG L 1 1 Q VMATE LA 
HSLRCSAEEATEGLMNLSPSAMKNLLHKILSGKEFGVERK/SVR 
PTDSNVSPAI S IHE I GAVGATKTERTGI MQLKSEI KQVE FRRLS 
I S AE SQS PGTSMT PS SGS F PS A Y DQQS5 KDS RQGQWQRRRRLDG 
ALNRVPVGFYQKVWKVLQKCHGLSVEGFVLPSSTTREMTPGEIK 
FS VHVES\VLNVLIiRPEYRQLIjVEAI LVLTMLADI EIHS IGS 1 1 
AVE K I VH I ANDLFLQE Q KTLGP \DDTMIiAKD P ASG \ 1 CTLR \ YD 
SAPSGRFGTMTYLS \ RAA\ATY VQE FLP\HS I CAMQ 


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR
KDVWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES 
LQTVAEEESCSPSVELEKPPPVNVDSKPIEEKTVEVNDRKAEFP 
SSGSNFSA*IPLPYLHLNRLHQSL*QKGSRQQSSVTVSEPLAPN 
QEEVRSIKSETDSTIEVDSVAGELQDLQSERE*LASRF*CQCEL
KQ+ *SARTRTS*KSLYRSEKSERCSGRRKFI KKAEKKP*SNSGK 
QQKEGKRHK 


6805 


1*39 


206 


RQPDLKYFGKSFDVSVSESSSLLSNDLPKFADGIKARNRNQNYL
VPSPVLRILDHTAFSTEKSADIVICDEECDSPESVNQQTQEESP 
IEVHTAEDVPIAVEVHAISEDYDIETENNSSESLQDQTDEEPPA 
KLCKILDKSQALNVTAQQKWPLLRANSSGLYKCELCEFNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLLIEHAKLHEEDPYI
CKYCDYKTVIFENLSQHIADTHFSDHLYWCEQCDVQFSSSSELY
LHFQEHSCDBQYLCQFCEHETNDPEDLHSHWNEHACKLIELSD 
KYNNGEHGQYSLLSKITFDKCKNFFVCQVOGFRSRLHTNVNRHV 
AIEHTKIFPHVCDDCGKGFSSMLE\IAKHLNSHLSEGIYLCQYW 
EYSTGQIEDLKIHLDFKHSADLPHKCSDCLMRFGNERELISHLP 
VHETT 


6806 


272 


3794 


VALCFPNSDPVMFMDAFYGCLLAELGPVPIEVPLTRKDAGSQQV 
GFLLGSCG VFIiALTTDACQKGLPKAQTGE VAAFKG WP PL S WLV I 
DGKHLAKPPKDWHPLAQDTGTGTAYI E YKTS KEGSTVG VTVSHA 
S LLAQCRALTQACGYS EAETLTNVLDFKRDAGLWHGVLTS VMNR 
MHVVSVPYALMKANPLSWIQKVCFYKARAALVKSRDMHWSLLAQ 
RGQRDVSLSSLRMLIVADGANPWSi SSCDAFLNVFQSRGLRPE V 
ICPCASS PEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVI RVD 
TEEKLS VLTVQDVGQVM PGANVCWKLEGTPYLCKTDEVGE ICV 
SS S ATGTAYYGLLGITKNV FE AVPVTTGGAP I FDRPFTRTGLLG 
F IG PDHLVF1 VG KLDGLM VTG VR RHNADDWATALAVE PM K FVY 
RGR I AVFSVTVLHDDRI VLVAEQRPDASBEDS FQWMS RVLQAI D 
S IHQ VG VYCLALV PANTLP KAPLGG I H I S ETKQR FLEGTLHPCN 
VLMCPHTCVTNLPKPRQKQPEVGPASMIVGNLVAGKRIAQASGR 
ELAHLEDSDQARKFLFLADVLQWRAHTTPDHPLFLLLNAKGTVT 
STATC VQ LH KRAERVAAALME KGRLS VGDHVALVY PPGVDL I AA 
FYGCL YCGCVPVTVRPPHPQNLGTTLPTVKM I VEVS KSACVLTT 
GAVTRLLRSKEAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP 
DVUIYLDFSVSTTGILAGVKMSHAATSALCRSIKLQCELYPSRQ 
IAICLDPYCGLGFALWCLCSVYSGHQSVLVPPLELESNVSLWLS 
AVSQYKARVTFCCYSVMEMCTKGIiQAQTCVLRMKGVNLSCVRTC 
MWAEERP\RIALTQSFSKLFKDLGLPARAVSTTFGCRVNVAIC 
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SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








^TAGPDPTTVYVDMRALRHDRVRLVKKGSPHSLPLMES'GKIL 
PGVKVI IAHTBTKGPLGDSHLGl?IlJV<3QDUMaTnwT\/vr»T?cj\T 

HADHFSARLSFGDTQTIWARTGYLGFLRRTELTDASGGRHDALY 
WGSLDETLELRGMRYHPIDIETS VIRAHRS IAECAVFTWTNLL 
WWELDGLEQDALDLVALVTNWliEEHYliWGWVI VDPGVI P 
INSRGEKQRMHLRDG FLADQLDP I YVAYNM 


6807 


1444 


606 


VGHDTVHAM FTC FPKCLG FS P P VNVT VS PRS EE SHTTT VSGGNG 
SVFQAGPQLQALANLEARRGSIGAALSSRDVSGLPVYAQSGEPR
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGFCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 
RFPCGMEVHSGQRELESWAVGEAMA\liKFPMGAMSYCLRDRSR 


6808


2063 


737 


GVGSGAASAIjARSRPIiASRIjSSRRRTRAPRSGAMQRIiAMDLRML 
SRELSLYLEHQVRVGFFGSGVGLSLILGFSVAYAFYYLSSIAKK 
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPFX 
I TS KP PVQYRNEL I KTADGGQ I SLDWFDNDNS TCYMDAS TR PT I 
LLLPGLTGTSKESYILHMIHLSEELGYRCVVFNNRGVAGENLLT 
PRTYCOOTEDLETVIIIHVHSLYPSAPFIiAAGVSMGGMLLLNYZ* 
a vjo r\.i tr ijrLAAA 1 1 a VGWWTr ACSESLEKPLNWLLFNYYLTTC 
LQSSVNKHRKMFVKQVDMDHVMKAKSIREFDKRFTSVMFGYQTI 
DDYYTDASPS PRLKS VG I PVLCLNS VDDVFS PSHAI P I ETAKQN 
PNV7ALVLTS YGGH IG FLEG I W PRQS T YMDR VFKQ FVQAM VEHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQfVPQ
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF

v ^ A *^viiixrw&KO&itGrXjFVTFETSSDADRAREK^ 
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG
AWYQDGFYGAEI\LEATQPTDTLSPLQRRQPTATVTAESTQLP
TRTITPSGPRRPTALEPCETFHRFLLGP 


6810 


939 


65


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQrVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 

GOFGKILDVPTT PNPPfSQ Vy3EY2Y7t/»PirCTI«?On*T\t> ItnnrrT . 

uyi v c» J. j. r iNCtivvjO iv.vjC \j£ v 1 1 jCjIooUADRAREKIiNGTIV 
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG
AWYQDGFYGAEI\LEATQPTDTLSPLQRRQPTATVTAESTQLP
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811 


1*22 


658 


dlvtvwsfvdcrviasthghVkswvswafdpyttsveegdpme 

FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTDSRPVSVTYRFG 

svgqdtqlclwdltedilfphqplsrarthtnvmnatsppagsn 

GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 

tdpaktlgtplcprmbdvplleplickkiahbrltvlifledci 
vtacqeg f 1 ctwgrpg kws fnp 


6812 


4001 


1682



EUAV F S LDLS TI I QGTWFLNGE ELKSNEPEGQ VE PGALR YR I EQ 
KGLQHRLIUIAVKHQDSGALVGFSCPGVQDSAALTIQESPVHIL 
SPQDKVSLTFTTSERWLTCELSRVDFPATWYKDGQKVEESELL 
WKMDGRKHRHLPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFWLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSSWIVYPSGKVYVAAVRLERWLTCELCRPWAEVRWTKDGE 
E WES PALLLQKEDTVRRLVLPAVQLEDSGBYLCE I DDESAS FT 
VTVTEPPVRIIYPRDEVTLIAVTLECVVLMCELSREDAPVRWYK 
DGLEVEESEALVLERDGPRCRLVLPAAQPEDGGEFVCDAGDDSA 
FFTVTVTEPPVQFI^ETTPSPIiCVAPGEPVVLSCELSRAGAPV 
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SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 

amino acid

residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








VWSHNGRPVQEG5GLBLHAEGPRRVLCIQAAGPAHAGLYTCQSG 
AAPGAPS LS FTVQVAE P P VRWAPEAAQTR VRS TPGGDLE LWH 
libGPGGP VR w i JUJGJbRIiASQGH VQLEQAGARQVLRVQGARSGDA 
GEYLCDAPQDSRI FLVSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CEV S P PDADVTWLRNGAVVTPG PQRQS CCS YGGCRMCG QRKART 
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVS Y WPTRAPTAARATT IAPWPGSA 


6813 


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHL
LHSRQGSQIDQTECVIRMNDAPTRGYGRDVGNRTSLRVIAHSSI 
QRI LR NRHDLLNVSCX3TVFI FWG PSS YMRRDG KGQVYNNLHLLS 
QVLPRLKAFMITRHKMLQFDELFKQETGQ\NRKISNTWLSTGWF
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYEPFGPDEC
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


KFRRQEAN/ARERNRMHGLNDALDNLRKVVPCYSKTQKLSKIET
LRLAKNYIWALSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARSFLMGQGGEAAHHTRSPYSTFYPPYHSPELTTPPGHG
TLDNSKSMKPYNYCSAYESFYESTSPECASPQFEGPLSPPPINY
NGIFSLKQEETLDYGKNYNYGMHYCAVPPRGPLGQGAMFRLPTD
SHPPYDLHLRSQSLTMQDELNAVFHN


6815


906 


553 


QGLDPASQTKWELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGS FGKLRDALRRSS EMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816 


1 


803 


NLLKTHKF\LLGQDEDSLHS VPVAQMgnVqE YLKTLAS PLREI D 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
S PMS S KRRR SMS LLLRK PQTP PTVTNHVGGKG P PS AS WF PS YPN 
LIKPTLVHTDATIIHDGHEEKMENGQITPDGFLSKSAPSELINM 
TGDLMPPNQVDSLSDDFTSLSKDGIjIQKPGSNAFVGGAKNCSLS 
VDDQKDPVASTLGAMPNTLQITPAMAQGINADI KHQLMKEVRKF 
GRSK 


6817 


172 


3457 


LGMMDSPKIGNGLPVXGPGTDIGISSLHMVGYLGKNFDSAKVPS 
DE YCPACKERG KLKALKTYRI S FQES I FLCEDLQCI YPLGS KSL 
NNLISPDLEECHTPHKPQKRKSLESSYKDSLLLANSKKTRNYIA 
IDGGKVLNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLERNEI 
LEADT VDMATTKDPATVD VS GTGR PS PQNEG CTS KLEMPLES KC 
TS FPQALC VQWKNAY ALCWLDC I LS ALVHS E ELKNTVTG LCS KE 
ES I FWRLLTKYNQANTLLYTSQLSGVKDGDCKKLTS E I FAE I ET 
CLNEVRDE I FISLQPQLRCTLGDMES PVFAFPLLLKLETH I EKL 
FLYSFS WDFECSQCGHQYQNRHMKSLVTFTNVI PEWHPLNAAHF 
GPCNNCNS KSQI RKM VLEKVS P I FMLHFVEGLPQNDLQHYAFHF 
EGCLYQITSVI Q YRANNHF ITWILDADGSWLBCDDLKGPCS ERH 
KKF EVP AS EIHIVIWERKIS Q VTDKE AACLP LKKTNDQHALSNE 
KPVSLTSCSVGDAASAETASVTHPKDISVAPRTLSQDTAVTHGD 
HLLSGPKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LENKPV 

AE NTG I IjKTNTT .T . <? n F ^ I iM A Q QV Q a P r MR" VT . T rinrt TTUTi T e P Dcr» 
r\Eiiv 1UJ.UIV1L1 x * J 1 i»j^u JJJi'*/\o o v OrtJr v*ri oAJj J. c vUlof troy 

VV^^'NMQSVQLNTEDTVNTKS VNNTDATGLIQGVKSVEI EKDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKGINQKASHVSKKARKSASKPPPISKPPAGPP 
SSNGTAAHPHAHAASEVLEKSGSTSCGAQLNHSSYGNGISSANH 
EDLVEGOIHKLRLKXRKKLKAEKKKLAALMSS PQSRTVRSENLE 
QVPQDGSPNDCESIEDLLNELPYPIDIANESACTTVPGVSLYSS 
QTHEB I LAELLS PTP VSTELS ENGEGDFR YLGMGDSH I PPPVPS 
E FND VSQNTHLRQDHN YCS P TKKNP CE VQ P DS LTNNACVRTLNL 
ESPMKTDI FDEFFSSSALtNALANDTLDLPHFDEYLFENY 


6818 


2 


240 


RGFDKVLWT/LSGAVK\CVQFSRISPDGEEGYPGELKVWVTYTL 
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SEQ 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6819 


1 


961 


DGGE/LHS/ATTEHKP/VQATPVNLT\TILTSTWQARLPQI 

GIPCTEMGNFDNANVTGEIEFAIHYCFKTHSLEICIKACKNLAY 

GEEKKKKCNPYVKTyLLPDRSSQGKRKTGVQRNTVDPTFQETLK 

YOVAPAOLiVTPnTfOVQVWUT/^TT &DDWDT I > V\TT T r»T ^in.Tnnnno 
lyvnrnwuv ittyuy va V wriLAj A JLi/wKvrlAjCViXPLiATWDFEDS 

TTQSFRWHPLRAKADKYEDSVPQSNGBLTVRAKLVLPSRPRKLQ 
BAQEGTIX?PSLHGQLCLWI/3AKNLPVRPIX3TLNSFVKGCIiTLP 
DQQKLRLKS PVLRKQACPQWKHS FVFSGVTPAQIiRQSS LELTW7 
DQALFGMNDRLLGGT\RLGSKGDTAVGGDACSQSKLQWQKVLSS 

PNLWTDMTTiVT.M • 

f nun X. U\ I X JJ V l ' " 


6820 


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFNEKYLKVVRKHHRVIA 
GQ F FGHHHTDS FRML YDDAG VP I SAMFITPG VT PWKTTLPG WN 
GANN PA IR VFEYDRATLS tiXDMVTYFMNLSOANAQGTPR WELE Y 
QLTEAYGVPDA5AHSMHTVLDRIAGDQ3TLQRYYVYNSVSYSAG 
vLui^LanyHVLAnRyvuiUAi 1TCLYASGTTPVPQLPLLLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI
GSVIEVLQRRQEGLAS 


6822 


1088 
^~ 654 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQN
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI
GSVIEVLQRRQEGLAS 


6823




221 


PPKLLSRWARMGHGDBIV\LSDLNFPGLLHLPWGPWRSVQTAC 
G I PQLL E AVLKLL PLDT Y VES P AAVMELVP S DKERGLQT P VWTE 
YES I LRRAGCVRALAKI E RFE F YERAKKAFAWATGE TAL YGNL 
ILRKGVLALNPLL 


6824 


858 


104 


LLLAQRWGWG\CCFFSLAVSVKMNVLLFAPGLLFLLLTQFGFRG
ALPKLGICAGLQWLGLPFLLENPSGYLSRSFDLGRQFLFHWTV 
NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGESILS 
xjjjKi^^&i^KVFi'yfc'ijiPNQl VSTLiFTSNFIGICFSRSLHYQFYV 
WYFHTLPYLLWAMPARWLTHLLRLLVLGLIELSWNTYPSTSCSS 
AALHI CHAVILLQLWLGPQPFPKSTQHS KKAH 


6825 


3 


1173 


SSGEFGLQASD I MWTI S DTG WIL 1 1 LCS LME PW ALGACT F VHLL 
PKFDPLVILKTLSSYPIKSMMGAP1VYRMLLQQDLSSYKFPHLQ 
NCLAGGESLLPETLENWRAQTGLDIREFYGQTETGLTCMVSKTM 
KI KPG YMGTAAS C YDVQI IDDKGNVL PPGTEGD IG IR VKP I RP I 
GIFSGYVDNPDKTAANIRGDFWLLGDRGIKDEDGYFQFMGRADD 
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILALQFLSHDPEQLTKELQQHVKSVTAPYKYPRKIEFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPLLSLSF 
PFGPLALPMDGYGDSLWEEHEYKFCLALVrSTKLYKVRC 


6826 


2304 


954 


LKTESFKPW/VNIALAFHLLGERASPNSFWQPYIQTLPREYDTP 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKL PL KDS FT YED YR W AVS S VMTRQNQ I PTEDGS RVTLALI PLW 
DMCNHTNG L ITTG YNLEDDRCECVALQD FRAGEQ I YI FYGTRS N 
AE FVI HS G F FFDNNS HDR VK I KLG VS KS DRL YAM KAE VLARAG I 
PTSSVFALHFTEPPISAQLLAFLRVFCMTEEELKEHLLGDSAID 
RI FTLGNS EFP VS WDNEVKLWTFLEDRASLLLKTYKTTI E EDKS 
VLKNHDLS VRAKMAI KLRLGEKEILEKAVKSAAVNRE YYRQQME 
EKAPLPKYEESNLGLLESSVGDSRLPLVLRNLEEEAGVQDALNI 
REAI S KAXATENGLVNGENS I PNGTRS ENES LNQES KRA VBDAK 
GSSSDSTAGVKE 
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ID 

NO: 


Predicted
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


6827 


1 


779 


SSVVEFGLSVU3GLFLLFVLBNMLGLLRHRGLRPRCCRRKRRNL 
ETRNLDPENGSGMALQPLQAAPEPGAQGOREKNSQHPPALAPPG 
HQGHSHGHOGGTDITWMVLLGDGLHNLTDGLAIGAAFSDGFSSG 
LSTTLAVFCHELPHELGDFAMLLQSGLSFRRLLLLSLVSGALGL 
GGAVLGVGLSLGPVPLTPWVFGVTAGVFLYVALVDMLPALFPSS 
GAPAyA\HVLLQGLGLLLGGCLMLAITLLEERLLPVTTEG 


6828 


3 


1654 


KSQHG/ WI LQLMHSCKEGYVKDLKGNPGLHRAMLDLDNGTRPSE 
LGHIiSQTASLKRGSSFQSGRDDTWRYKTPHRVAFVEKIiTKLVLS 
QLPNFWKLW I S YVNGSLFSETAE KSGQ1 ERS KNVRQRQNDPKKM 
I QEVMHS LVKLTRGALLPLS I RDGEAKQYGGWEVKCEL5GQWLA 
HAIQTVRLTHESLTALEIPNDLLQTIQDLILDLRVRCVMATLQH 
TAEEIKRLAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLKGVLE 
CKPGEASVFQQPKTQEEVCQLSINIMQVFIYCLEQLSTKPDADI 
DTTHLSVDYSS PDLFGS IHEDFSLTSEQRLLI VLSNCCYLERHT 
FLN I AEH FE KHNFQG I E KI TQVS MASLKBLDQRLFEN Y I E LKAD 
P I VGSLE PG I YAG YFD W KDCLP PTG VRNYL KEALVNI IAVHAEV 
FTISKELVPRVLSKVIEAVSEELSRLMQCVSSFSKNGALQARLK 
ICALRDTVAVYLTPESKSSFKQALEALPQLSSGADKKLLEELLN 
j KFKSSMHLQLTCFQAASSTMMKT 


6829 


1 


782 


MRMEAGEAAPPAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTLCR

EQKSFLSRLCQGEELQSDRDETGAYLIDRDPTYFGPILNFLRHG 

KLVLDXDMAEEGVLEEAE F YNI G PL I R 1 1 KDRME BKD Y TVTQVP 

PKHVYRVLQCQEEELTQMVSTMSDGWRFEQLVNIGSSYNYGSED 

QAEFLCWSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 

EBVEVEQVQVEADAQEK/CCYKPEAPGCEAPDHliQGLGVPI 


6830


1 


939 


MEPGSVENLSIVYRSRDFLWNKHWDVRIDSKAWRETLTLQKQL 
RYR FPELADPDTCYG FRFCHQLDFSTSGALCVALNKAAAGSAYR 
CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 
EGSQGCENPKPSLTDLWLEJIGLYAGDPVSKVLLKPLTGRTHQL 
RV\HCSALGHPWGDLTYGEVSGREDRPFRMMLHAFYLRIPTDT 
ECVEVCTPDPFLPSLDACWSPHTLLQSLDQLVQALRATPDPDPE 
DRGPRPGSPSALLPGPGRPPPPPTKPPETEAQRGPCLQWLSEWT 
LEPDS 


6831 


3 


1087 


slffgsstpdnkvaeqbdletqpspsvekaVtvidpegtiptnf - 1 

NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQPKFISEVSREDYGKKEISGDSEEMNINSWTSADGENL 
EIQSYSLIGEKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
3IFKEEPR3DQKQKSLLSFDVVDKVPQQPKSASSNFASKNITKE 
SEKPESIILPVEESKGSUDFSEDRLKKEMQNPTSLKISEEETK 
LRSVSPTEKKDNLENR\SYTL\AEKKVLAEKQNSV\APLELRDS 
NE IG KTQ ITLGS RS TE LKES KADAM PQHFYQNED YNERP K I 1 VG 
SEKEKDEKKKK 


6832 


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV 
VSLKKKRSEDDYEPIITYQFPKRENLLRGQQEEEERLLKAIPLF 
CFPDGNEWASLTEYPRETFS FVLTNVDGSRKI GYCRRLLPAGPG 
PRLPKVYCIISCIGCFGLFSKILDEVEKRHQ1SMAVIYPFMQGL 
REAAFPAPGKTVTLKSFI PDSGTEFI SLTRPLDSHLEHVDFSSL 
LHCLSFEQILQI FASAVLERXI IFLAEGLSTLSQCIHAAAALL Y 
PFSWAHTYIPWPESLLATVCCPTPFMVGVQMRFQQEVMDSPME 
EVLLVNLCEGTFLMSVGDEKDILPPKLQDDILDSLGQGINELKT 
AEQINEHVSGPFVQFFVKIVGHYASYIKREANGQGHFQERSFCK 
ALTSKTNRRFVKKFVKTQLFSLFIQEAEKSKNPPAGYFQQK1LE 
Y EEQKKQ/TETKG KNCE 2 RAWNKND 


6833 


1 


1129 


PLMTLS QCGG I PGHGHSHGGHGHGHGLP KG PR VKS TRPGS S D I N 
VAPGEQGPDQEETNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
QVNGNLVREPDHMELEEDRAGQLNMRGVFLHVLGDALGSVIVW 
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Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NAiiVFYFSWKGCSEGDFCVNPCFPDPCKAFVEIINSTHASVYEA 
G P CWV LYLDPTLCWM VC I LLYTT Y PLL KESAL I LLQT VPKQ I D 
IRNLIKELRNVEGVEEVHELHVWQLAGSRI IATAHIKCEDPTS Y 
ME VAKTI KDVFHNHGI HATT IQPE FAS VGSKSS WPCELACRTQ 
CALKQCCGT L PQAP SG KDAE KT PAVS ISCLELSNNLEKKPRRTK 
AENIPA\WIEIKN\IPNK\QPESSL 


6834 


78 


1151 


AGQERPAPIWRLLWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 

LCG S SAS AYG WH * RLT P WS PGGS * HM * SS KA P VTQARE VLVAG P 

CSKLVLSGARGIVGTTVQVLVEAQQPLLLLFTGVWGLNLRAGEE 

SRAL*LIEEVTQVRDAHLGNAVVGCAQCLSQGQVGSALAKALLB 

AAAAVRDCKEVLTVSGDKQQAEVSVRL*VRDVCVEEAGCVEFGQ 

AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHEliVAG 

QQDGEDQAARTRLLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE 

LQQWGDAL+ARE*APQIIVLIiI,LEDVAQLRTGKKA*DLWDVE 
QLLRQL 


6835 


1 


834 


G I PAADR \ EASLELI KLDISRTFPNL^IFQQGGP YHJDMLHS iLG 
AYTCYRPDVG YVQGMS FIAAVL I LNLDTADAFI AFSNLLNXPCQ 
MAFFRVDHGLMLTYFAAFEVFFEENLPKLPAHFKKNNLTPDIYL 
IDWIFTLYSKSLPLDLACRIWDVFCRDGEEFLFRTALGILKLFE 
DILTKMDFIHMAQFLTRLPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC 
FRLTPGQPRRTDAL 


6836 


1 


850 


MSCGRPPPDVDGMITLK\/VDNLTYRTSPD^LRRVFgKYGRVGDV " 

YIPREP.HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 

QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 

RRHRSRSRGPS CS RSRS RSR YRGS R YSRS PYSRS PYSRSR YSRS 

PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 

SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 

KSRSRSKRPPKSPEEEGQMSS 


6837 


1 


1369 


TDGAAVAGNPGSDYFPGGTAP/GGPRTRRP\SGTSSSGSKASGP 
PNPPAQGDGTSLSPNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPGVS PGQQQASGAAVGGSSAGET 
RGAPTPHEKALTS PSWGKGAELLLGDQPDLIGS LDGGAKS DSSS 
PNVGEFASDEVSTSYANEDEVSSS SDNPQALVKASRS PLVTGSP 
KLPPRGVGAGEHGPKAPPPALGLGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPLEILQAQIQLQRQQFSIS 
EDQPLGLKGGKKGECAVGASGAQNGDSELGSCCSEAVKSAMSTI 
□LDSLMAEHSAAWYMPADKALVDSADDDKTLAPWEKAKPQNPNS 
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPTWR 
SLHSDI SNRFGTFVAALT 


6838 


16 


499 


LTDTP P P KTHM I HHS 1 3 D YKATLRC WALGFY PME I TLTWQQDEE 
DQTRDHELVETRPAGDGTFQKWAAVWPSGEE/Q/RYMCHVQHE 
GL PE P LTLRWEQSSQPT IPIVG I VAGLVLLGAWTGA WS AVMC 
RKKNSDRVSYSEAASSDHAQGSDVSLTACKV 


6839 


1 


1195 


AAPAGGGPDPEALSAFPGRHLSGLS WPQ VKRLDALLS EP I P IHG 
RGNFPTLS VQPRQIRAGGPQHPGGAG \ IHVHR VRLHGS AASHVL 
HPESGLGYKDLDLVFRMDLRSEASFQtTKAWIiACLLDFLPAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDS VRRQFE FS I DS FQ 1 1 LDSLLL FG QCSSTPMS E AFH PT VTG 
ESLYGDFTEALEHLRHRVI ATRS PEE I RGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
AARRYACLVTLHRWNESTVCLMNHERRQTLDL I AALALQALAE 

QGPAATAA1»AWRPPGTI>3VVPATVNYYVTPVQPLLAHAYPTWLP 
CN 


6840 


4254 


2061 


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLIRVDGKGSIKEL 
F PTG KQ LE PL VAPLADGKVAVGQPDLT WUJEEG I CTQKCALNW 
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Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








rDIPVAMEHQPPYIIAVUPRYVEIRTFEPRLLVQSIELQRPRFl 
TSGGSNI I YVASNHFVWRL I PVPMATQIQQLLQDKQFSLALQLA 
EMKDDS DS E KQQQ I HH I KNLYAFNL FOQKR FDESMQV FAKLGTD 
f i HVMtjjjYPDLLPTDYRKQLQYPNPLPVLSGAELEKAHLALIDY 
LTQKRSQLVKKLNDSDHQSSTSPLMEGTPTIKSKKKLLQIIDTT 
LLKCYLHTNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELI I 
LYEKKGLHEKAbQVLVDQSKKANSPLKGHERrVQYLQHLGTENL 
HLIFSYSVWVLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 
FKGLAI PYLEHIIHVWEETGSRFHNCLIQLYCEKVQGLMKEYLL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 
PFDGLLEERALLLGRMGKHEQALFIYVHILKDTRMABEYCHKHY 
DWJKDGNKDVYLSLLRMYIiSPPSIHCLGPIKLELbEPKANLQAA 
LQVLELHHSKLDTTKALNLLPANTQINDIRIFLEKVLEENAQKK 
RFNQVLKNLLHAEFLRV\QEERILHQQVKCIITEEKVCMVCKKK 
IGNSAFARYPNGVWHYFCS \KEVNPADT 


6841 


1 


3206 


TPSTTGTKSWrPTSSVPSAAVTPLNESLQPlOGDYGVGSKNSKRA 
REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 
MCPETRLDRTGSSPTQGIVNKAFGINTDSLYHELSTAGSEVIGD 
VDEGADLLGEFSGMGKEVGNLLLENSQLLBTKNALNWKNDLIA 
KVDQLSGEQEVLRGELEAAKQAKVKLENRIKELEEELKRVKSEA 
IIARREPKEEAEDVSSYLCTESDKIPMAORRRFTRVEMARVLME 
RNQYKERLMELQEAVRWTEMIRASREHPSVQEKKKSTIWQFFSR 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 
SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SL PAKYKQLS PNGGQEDTRMKNVP VP VYCRPLVEKDPTMKLWCA 
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATSSRVWILTSTLTTSKVVIIDANQPGrvVD 
QFTVCNAHVLCISSIPAASDSDYPPGEMFLDSDVNPEDPGADGV 
LAGITLVGCATRCNVPRSNCSSRGDTPVLDKGQGEVATIANGKV 
NPSQSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 
PSSGPQPGSENGPEPDSSSTRPEPEPSGDPTGAGSSAAPTMWLG 
AQNGWliY VHS AVANWKKCLHS I KLKDS VLSLVHVKGRVLVALAD 
GTLAI FHRGEDGQWDLSNYHLMDLGHPHHSIRCMAWYDRVWCG 
YKNKVHVIQPKTMQIEKSFDAHPRRESQVRQLAWIGDGVWVSIR 
LDSTLR L YHAHTHQH LQDVD I EP YVS KMLGTG KLG FS FVR I TAL 
ij vmo ^KijWVGTGNG WI S I PLTETWLHRGQ\ LLG \ LRANKTS P 

tsgeg\arpgg\iihvyg\ddssdraarsfipycsmaqaqlcfh 
ghrdavkffvsvpgnvlatlngsvldspaegpgpaapasevegq 

klrnvlvlsggegyidfrigdgeddeteegagdmsqvkpvlska 
ershi i vwqvs ytpe 


6842 


3 


926 


RCQQLS AT I LTDHQ YLERT PLCAILKQ KAPQQ YR I RAKLRS YKP 
RRLFQSVKLHCPKCHLLQEVPHEGDLDIIFQDGATKTPDVKLQN 
x o u x usa n.i w 1 1 junumjR KVAVHFVKNNG I L PLSNE CLLL I EGGT 

LSEICKLSNKFNSVIPVRSGHEDLELLDI,SAPFLIQGTVHHYGC 
KQWST* RS IQNLNSLVDKTSWI PSS VAEALGI VPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFQIPASEVLMDDDLQKSVDMIMDMFC 
P PG I KI DAYPWLE C F I KS YNVTNGTDNQ I CYQ I FDTTVAE DV I 


6843 


2 


851 


NHRKVLSGAKRYECNECGKSFAYTSSLIKHRRIHTGERPYECSE 
CGRSFAENSSLIKHLRVHTGERPYECVECGKSFRRSSSLLQHQR 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSSLIIHLRVHTGERPYECSDCGKSFAENSSLIKHLRVHTGE
RPYECIDCGKSFRHSSSFRRHQRVHTGMRPYK*SKFWKFSCPGF 
LLLQGQR VHTGSRC Y BCDKWG I FFS * NAS F FT * KSAPTEEVP FE 
CNECEKAFS PLSLVTTI FT 


6844 


244 


642 


EHQIAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
QYDVKSEIYSFGIVLWEIATGDIPFQGCNSEKIRKLVAVKRQQE 






WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLGEDCPSBLiREiIDBCRAHDPSVRPSVDEILKKLSTFSK*CIK 
I 


6845 


3 


1519 


VAVRDECYWRHVFWDODTAJMT.T ittt jtfir'U'ppfpAOTvnT p^nVn^^ ™ — 
GALENAQNLGYQGAKFAWESADSGLEVCPEDIYGVQEVHVNGAV 
GLAFELYYHTTQDLQLFREAGGWDWRAVAEFWCSRVEWSPREE 
KYHLRGVMS PDEYHSG VNNS VYTNVL VQNS LRFAAALAQDIjGLP 
I PSQWLAVADKIKVPFDVEQNFHPEFDGYEPGEWKQADWLLG 

ypvpfslspdvrrknleiyeavtspogpamtwsmfavgwmelkd 
avrarglldrsfanmaepfkvwtenadgsgavnfltgmggflqa 
wfgctgfrvtragvtfdpvclsgisrvsvsgifyqgnklnfsf 

S K DSVTVTRUT 21 0 BOOMS DUT t?7\ t?T i.me*ncr\t r> t r iwummm... 

oowov * va v A>*iouavwAFnljiiAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSS EFPGRTFSDVRDPLQS Pt,WVTLGSSSP 
TESLTVDPASE*SGTGASETSLGPSLWPRLHPPLLGTLLACHPS 
PAARIjSGKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFLKTIK*LNRLAEHP*YENEKLTKLRNTIMEQYTRTEESARG
1 1 FTKTRQSAYALSQW I TENEK FAE VG VKAHHLIGAGHS SEFKP 
MTQNEQKEVISKFRTGKINLLIATTVAEEGLDIKECN1VIRYGL 
VTNE I AM VQARG RARADE S TYVLVAHSGSGV I EHETVND FREKM 
KYKAIHCVQNMKPEEYAKKILELQMQSIMEKKMKTKRNIAKHYK 
nny&ux l r IjUIWCSVLACSGEDIHVIEKMHHVNMTPEFKELYIV 
RENKTLQKKCADYQINGEI ICKCGQAWGTMMVHKGLDLPCLKIR 
NFWVPKNNSTKKQYKKMVELPITFPNLDYSECCLFSDED 


6847 


1450 


348 


SMCWNSDRLEMPLIDLALILYPPSYVPYTGHLSDDSL^RKYCLT * 

WFEDALNGVI>*RAEAIQPHCV^GDRMEKFRQKYWNKLQTLRQQ 

PFAYGTLTVRSLLDTREHCLNEFNFPDPYSKVKQRENGVALRCF 

v vk^s>uuj\ij\3 wac,KyJaAJUVK^jjiiAGNVFDWGAKAvs 
YFGFEEAKRKLQERPWLVDSYSEWLQRLKGPPHKCALIFADNSG 
IDIILGVFPFVRELLLRGTEVILACWSGPALNDVTHSESLIVAE 
R I AGMD P WHS AliREERLLLVQTGS S S PCLDLS RLDKGLAALVR 
ERGADLWI EGMGRAVHTNYHAAltRCESLKLAV I KNAWJjAERLG 
GRLFS VI FKYEVPAE 


6848


19 


16 


ru innH^uiAjxjuii vijjiNi'Ais-Krg iijbJjJu^ijKsiiQsnT *»HDADSND 
LKVIIISAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH 
IRNHPVPV1 AMVNGLATAAGCQLVAS CDIAVAS DKSS FATPGVN 
VGL PCS TPG VALARAVPRKVALEML FTGE P I S AQEALLHGLLNK 
WPEAEIiQEETMRXARKIASLSRPWSIjGKATFYKQLPQDLGTA 
YYLTSQAMVDNLALRDGQEGITAFLQKRKPVWSHEPV*VEH 


6849 


70 


821 


SLGVDGSCLEQGSPAPRPQTDTSP* p vgnwatOqedlyhqs yec " 
VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDELLSK 

pkfsgvekiktigstymaatglnatsgqdaqqdaerscshlgtm 

VEFAVALGS KLDVI NKHS FNNFRLRVG LNHGP WAGVIGAQKPQ 
YDI WGNTVNVAS RMESTGVLGKTnVTPPTawnT.ncT /ivt^vc D n 
VIKVKGKGQLCTYFLNTDLTRTGPPSATLG 


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQELHLFMIiSGVPDAVFDLTD 
LDVLKLELI PEAKI PAKISQMTNLQELHLCHCPAJCVEQTAFS FL 
RDHLRCLHVKJ^DVAEIPAWVYIiiKWLRELYLIGNLNSENNKMI 
GLES LRELRHLKI LHVKSNI/TKVPSN I TDVAPHLTKLVIHNDGT 
KLLVLNSLKKMMNVAELELQNCELERI PHAI FSLSNLQELDLKS 
NNIRTIEEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNLESL 
YFSNNKLESLPVAVFSLQKLRCLDVSYNNISMIPIEIGLLQNLQ 

h tihi tgnkvd i l pkqlfkc i klrtln lgqnc i ts lpe kvgqlsq 
ltqlelkgncldrlpaqu;qcrmlkksglvvedhlfdtlplevk 
ealnqdinipfangi 


6851 


1765 


660 


VSAQVSAREGENCI^WNLADSSQESYKSLEEAEDCYPPSLIjTLD 

i^lfnqveg^pllscpkagtdlskksrj^vgwmaaglmigaga 
cycvykltigrddsekleeegeeewdddqeldeeepdiwfdfet 








MARPWTEDGDWTBPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE
H KNTWS AQNCKNGS CVLDLS KCLFI QG KLLFAE PKDAG F P FSQD 
INSHLASLSMARNTS PTPDPTVREALCAPENLNAS I ESQGQIKM 
YIWEVCKETVSRCCNS FLQQAGLNLLISMTVINNMLAKSASDLK 
FPLI SEGSGCAKVQVLKPLKGLSBKP VLAGELVGAQMLFS FMS L 
P I RNGNRB I LLETPAP 


6852 


1 


407 


RTRGI5ETYANFIKHNDGKNIFYAARTPATLFAVMFAMYI ISGLT * 
G FI GLNS I AVLCNL VMG LAL I FLCTWAYVKY <?C5K PP r t r tv t nn 

IAETLWEQVLKPLGDNLMSENI RQS VTNS IKAGLTDQVS HHARL 
KTD 


6853 


3 


469 


GDSCAVCIELYKPNDLVRILTCNHIFriKTCVDPWLLEHRTCPH5 
KCDILKALGIEVDVEDGSVSLQVPVSNEIFNSASSHEEDNRSET 
ASSGYASVQGTYEPPLEEHVQSTNE S LQLVNHEANS VAVDVI PH 
VDNPTFEEDETPNQETAVREI KS 


6854 


1148 


585 


HESYIGTFDPGELCVCAAIQWLQDNSASYFLNRKLVYEPSTQAK
PVKNTFLRMWIYSHHIYQQDLRKKILDVGKRLDVTGFCMTGKPG 
1 1 C VEGFKEHCEE FWHT I RY PNW KH I S CKHAE S VE TEGNGEDLR 

LFHSFEELLLEAHGDYGLRNDYHMNLGQFLEFLKKHKSEHVFQI 
LFGIESKSSDS 


6855 


1913 


1148 


GRVGGRVGRICSPLSGANEYIASTDTLKTEEVLLFTDQTDDLAK
EEPTSLFQRD3ETKGESGLVLEGDKEIHQIFEDLDKKLALASRF 
YI PEGCIQRWAAEMWALDAIiHREG I VCRDLNPNNI LLNDRGHI 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAITEETEACDWWSL 

GAV IiPRT»TiTfiff TT ■V'P'PHD Hr2 T MTU r PTrATMr>Tr'M'irr<or?7\ t-1 -mn-r 
■ *■ c>t»ijiui\.i ljv m^ntrnsj in i til I uNMFbW VoEIsARSIjIQQIj 

LQFNPLERLGAGVAGVEDI KSHPFFTPVDWAELMR 


6856 


1617


997


VTQLYVSVDASTKDSLKKXDRPLFKDFWOXJFLDSLKALAVKQQR
TVYRLTLVKAWNVDELQAYAQLVSIiGNPDFIEVKGVTYCGESSA 
SSLTMAHVPWHEEWQFVRELVDLIPEYEIACEHEHSNCLLIAH 
RKFKIGGEWWTWINYNRFQELIQEYEDSGGSKTFSAKDYMARTP 
HWALFGASERGFDPKDTRHORKNK<5 K at qnr 


6857 


1 


617 


KGPEATAMVCVCSHPNCRQNHIKPSHSAAQTWCGSPTPASAPNH 
ICI^AMEQGKTLPSATEDAKEEGLEAQISRLAELIGRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQLSEFLDSLRQYIjRGT 
TGVRNCFHITAVRLSDGFTFVIYEFWETEEAWKRHLQSPLCKAF 
RHVKVDTIiSQPEALSRILVPAAWCTVGRD 


6858 


2 


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLDSGIRISSVFASPA
LRC VQTAKL ILBELKLEKKIKI RVE PG I FEWTKWE AGKTT P TLM 
SLEELKEANFNIDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 
IVNTCPQDTGVILIVSHGSTLDSCTRPLLGt.PPRFrv;npanT ud 
KI PSLGMCFCEENKEEGKWEliVNP PVKTLTHGANAAFNWRNW I S 
GN 


6859 


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDTToqpqqTrtT t vcn — 
KTNSVESLPELLTSDSEGS YAGVGS PRDLQS PDFTTGFHSDKIE 
AKVKP YVNGTS P VYSR EDL KPWEKSPILKI S APQ PIP SNR I DTT 
SSASWVAGSFSPVSPPWDLRTIME I EESRQKCGATPKSHLGKT 
VSHGVKLSQKQRKMIALTTKENNSGMNSMErVLFTPSKAPKPVN 
AWASSLHSVSSKSFRDFLLEEKKSVTSHSSGDHVKKVSFKGIEN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPV7LSSSVTAPSM 
VAPVTFASIVEEELQQEAAIjIRSREKPIjALIQIEEHAIQDLLVF 
YEAFGN PEEF VI VERTPQG P LA VPMWNKHGC 


6860 


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ
DTGLTI LQVNNWF INARRI I VQ PMI DQSNRAVS QGAAYS PEG Q P 
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6861 


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ 
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSOGAAYSPEGQP 








MGSFVLDGQQKMGIRPAGPHSGMGMNMGMDGQWHYM 


6862 


2 


471 


EE I DRE FHNKLKLKEDfCLEKQE KP VNGEDKGDSGVDTQNS EGNA 
DEEDPLGPNCYYDKTKSFFDNISCpDNRBRRPTWAEERRLNAET 
FGI P LRPNRGRGG YRGRGGLGFRGGRGRGGGRGGTFTAPRGFRG 
GFRGGRGGRE FADFE YR KTTAFG P 


6863 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCKQVCSTVGGS 
AI CS CFPGYAIMADGVSCEDQDECLMGAHDCS RRQFCVNTLGSF 
YCVNHTVLCADGY ILNAHRKCVDI NECVTDLHTCSRGEHC VNTL 
GSFHCYKALTCEPGYALKDGECEDVDECAMGTHTCJQPGFLCGNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTSbSEPCRPGFSCI 
NTVGS YTCQRNPL ICARG YHASDDGTKCVDVN EC ETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
C ENTLGS YRCS CASG FLLAADGKR C EDVNECE AQRCS Q EC AN I Y 
GSYQCYCRC^YQLAEDGHTCTDIDBCAQGAGILCTFRCLNVPGS 
YQCACPEQGYTMTANGRSCKDVDECALGTHNCSEAETCHNIQGS 
FRCLRFECP PNYVQVS KTKCERTTCHDFLECQNS PARITHYQLN 
FQTGLLVPAH I FRIGPAPAFTGDTI ALNI IKGNEEGYFGTRRLN 
AYTG VVYLQRAVLE PRD FALD VEMKbWRQGS VTTFLAKMH I FFT 
TFAL 


6864


2 


2933 


LADS S PSNLQ III KELLSMHHQPDPALTKEFD YLPP VDSRS S SG 
FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 
FYQVQSLFGHLMES KLQY YVPENFWKI FKMWNKELYVREQQDAY 
EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 
ERBEAFMALNLGVTSCQSLE I SLDQ FVRGEVLEGSNAY YCE KCK 
EKRI TVKRTC I KSLPSVLVI HLMRFG FDWBSGRS I KYDEQ I RFP 
WMLNMEP YTVSGMARQDSS S EVGENGRS VDQGGGGS PRKKVALT 
ENYELVGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIE 
EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRYWNAYMLFY 
QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPQSSPR 
PHRPNNDRLS I LTKLVKKGEKKGLFVE KMPARI YQMVRDENLKF 
MKNRDVYSSDYFSFVLSLASLNATKLKKPYYPCMAKVSLQLAIQ 
FLFQTYLRTKKKLRVDTEEW IATIE ALLS KSFDACQWIiVBYFI S 
SEGRELIKIFLLECNVREVRVAVATILEKTLDSALFYQDKLKSL 
HQLLEVLLALLDKDVPENCKNCAQYFFLFNTFVQKQGrRAGDLL 
LRHSALRHMI S FLLGASRQNNQIRRWS S AQAREFGNLHNTVALL 
VLHSDVSSQRNVAPGIFKQRPPISIAPSSPLLPLHEEVEALLFM 
SEGKPYLLEVMFALRELTGSLLALIEMWYCCFCNEHFSFTMLH 
FIKNQLETAPPHELKNTFQLLHEILVIEDPIQVERVKFVFETEN 
GLLALMHHSNHVDSSRCYQCVKFLVTLAQKCPAAKEYFKENSHH 
WS WAVQWLQKKMS EHYWTLQSN VSNETSTGKTFQRTI SAQDTLA 
YATALLNEKEQSGSSNGSESSPANBNGDRHLQQGSESPMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQRAAD YNQALGTCR LAGTALC VAAGVLLAI CL FWAM trwt, q 0 
DTKAEPLDPEADSH VE VFGDE PEQQLS P I FRNASGQS W FS P PAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYTLYGLRATCMRDLDWAWINAVSAFKALEQDLPVNIKF 
IIEGMEEAGSVALEELVEKEKDRFFSGVDYIVISDNLWISQRKP 
AITYGTRGNSYFMVEVKCRDQDFHSGTFGGILHEPMADLVALLG 
SLVDSSGHILVPGIYDEWPLTEEEINTYKAIHLDLEEYRNSSR 
VEKFLFDTKEEILMHLWRYPSLS IHGIEGAFDEPGTKTVI PGRV 
IGKFS I RLVPHMNVSAVEKQVTRHLEDVFSKRNS SNKMWSMTL 
GLHPWI AN I DDTQYLAAKRAI RTVFGTEPDM I RDGSTI P I AKM F 
QE I VHKS WLI PLG AVDDGEHSQNE KI NRWN Y I EGTKL FAAF FL 
EMAQLH 


6867 


2833 


1704 


GTRIMSQPKQKELAGFVRQKMLLDYSVYMGRCVPQESRSPQRSP
LQSAESSPTAGKKLPEVPPSEEEEQEAWVNALLGRIFWDFLGEK 
YWSDLVS KK I QMKL5 K I K LP YFMNELTLTBLDMGVA VPK I LQAF 
KPYVDHQGLWIDLEMSYNGSFLKTLETKMNLTKLGKEPLVEALK 
VGE I GKEGCR PRAFCLADS DEESS S AGSSEEDDAPEPSGGDKQL 
LPGAEGYVGGHRTSKIMRFVDKITKSKYFQKATETEFIKKKIEEx 
VSNTPLLLTVEVQECRGTIAVNIPPPPTDRVWYGFRKPPHVELK 
ARPKLGEREVTLVHVTDWIEKKLEQEFQKVFVMPNMDDVYITIM 
HSAMDPRSTS CLLKDPPVEAADQP 


6868 


1 


346 


RPTKPPTRPEEIKNLILPYISDMNFVQDI^EDFYELFKf DKGFD 
KATFESQMSVMRGQILNLTQALRDGKSPFQLVQIPCVIVERSQG 
GSQGR I VHLSNS FTQTVNCRKP FFS S W 


6869 


3 


1619 


MYMERMDKRALISFWESVEHLKNANKNEIPQIiVGElYQNFFVES 
KEISVEKSLYKEIQQCLVGNKGIEVFYKIQEDVYETLKDRYYPS 
F I VSDLYEKLL I KEEE KHAS QM I SNKDEMG PRDE AG EEA VD DGT 
NQ I NEQAS FAVN KLRE LN E KL E Y KRQ ALNS IQNAPKPDKK I VS K 
LKDEIILIEKERTDLQLHMARTDWWCENLGMWKASITSGEVTEE 
NGEQLPC YFVMVS LQEVGG VETKNWTVPKRLS E FHNLHRKLSEC 
VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 
LCQSEALYAFLSPSPDYLKVIDVQGKKNSFSLSSFIjERLPRDFF 
S HQEE ETE EDSDLS DYGDD VDGRKDALAEPCFMIi IGEI FE LRGM . 
FKWVRRTLI ALVQVTFGRT INKQI RDTVSWI FSEQMLVYY I NI F 
RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENIPDMLQSL 
VGQQNARHGIIKIFNALQETRANKHLLYALMELLLIELCPELRV 
HLDQLKAGQV 


6870 




1566 


MAAWAATRWWQLLLVLSAAGMGASGAPQPPNILLLLMDDMGWG 

dlgvygepsretpnldrmaaegllfpnfysanplcspsraai.lt 

GRLPIRNGFYTTNAHARNAYTPQEIVGGIPDSEQLLPELLKKAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEALDFIKRQ 
ARHHPFFLYWAVDATHAPVYASKPFLGTSQRGRYGDAVREIDDS 
IGKILELLQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLC 
GKQTTFEGGMREPALAWWPGHVTAGQVSHQLGS IMDLFTTS LAL 
AGLT P PS DRAI DG LNLLPTLLQGRLMDR P I FY YRGDTLMAATLG 
QHKAH FWTWTNS WENFRQGI DFCPGQNVSG VTTHNLEDHTKLPL 
IFHLGRDPGERFPLSFASAEYQEALSRITSWQQHQEALVPAQP 
QLNVCNWAVMN WAP PG CEKLG KCLT P PES I PKKCLWSH 


6871 


209 


1126 


RMSLNPPIFLKRSEENSSKFVETKQSQTTSIASEDPLQNLCLAS 
QEVLQKAQQSGRS KCLKCGGSRM FYCYTCYVP VENVP I EQI PLV 
KLPLKIDirFCHPNETDGKSTAIHAKLLAPEFVNIYTYPCIPEYE 
EKDHE VALI PPGPQS I S I KD I S FHLQKR I QNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
DILKEKYRGQYDNLLFFYSFMYQLIKNAKCSGDKETGKLTH 


6872 


880 


459 


FGLLMWLS LI FMKGNC VREDLI FNFLFKLGLD VRETNGL FGNT 
KKLI TEV FVRQKYLE YRRI P YTE PAE YE FLWGPRAFLETSKML V 

LRFLAKLHKKDPQSWPFHYLEALAECEWEDTDEDEPDTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLLFIPGCKTPDQLKKEDSHCN 
IIHTEIFGFSNNYWELRRRRPKLKKLKKLLMENPYEGPDSQKEK 
DSNSSKYTTEDLLDQIQASEEEIMTQLQVLNACKIGGYWRILEF 
DYEMKLLNHVTOLVDSESWSFGKVPLNTCLQELGPLEPEEMIEH 
CLKCYGKKYVDEGEVYFELDADKICRAAARMLLQNAVKFNLAEF 
QEVWQQSVPEGMVTSLDQLKGLALVDRHSRPEIIFLLKVDDLPE 
DNQERFNSLFSLREKWTEEDIAPYIQDLCGEKQTIGALLTKYSH 
SSMQNGVKVYNSRRPIS 


6874 


1 


307 


DS I ADHVNSAAVNVEEGTKNLGKAAKYKLAAIjP vagaliggmvg 
G P I G LLAGF KVAG I AAALGGG VLG FTGG KL IQRKKQ KMME KLTS 

S CPDliPSnTDlOf PC 


6875 


1688 


349 


VIGTGERGNSASEKWEIMFNEELGDPFIIIHSISLLNAEEHSIA 
TLLLR I EKBELDMKGSG FYVSIiE WVTI S KKNQDNKKYE 1 1 KRD I 
LRGKSVPHYAAIEPDGNGIiMIVSYKSLTFVQAGQDLEENMDEDI 
SEKIKEPLYYWQQTEDDLTVTIRLPEDNTKEDIQIQFLPDHINI 
VLKDHQFLEGKLYSS IDHESSTWI IKESNSLEISLIKKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQELEECDIFFEESSSLCRFDGNTLKTTHWNLGSNQYLFSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QASKRDKKFFACAPNYSYAALCECLRRVFIYRQPAPMSTVLYNR 
KEGRQVGQVAKQQVAS LETNDPI LGFQATNERLFVLTTKNLFLI 
KVNTEN 


6876 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS 
LHTKPRMPPCDFMPERYQVIFLVNSGSEANELAMLMARAHSNNI 
DIISFRGAYHGCSPYTLGLTNVGIYKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDSPVQTIRKCSCAPDCCQAKDQYIEQFKDTLSTS 
VAKS I AGFFAE PIQGVNGWQYPKGFLKEAFE LVRARGG VCI AN 
EVQTGFGRLGSHFWGFQTHDVLPDI VTMAKG IGNGFPMAAVI TT 
PEIAKSLAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQENSQ 
EVGTYHLLKFAKLRDEFEIVGDVRGKGLMIGIEMVQDKISCRPL 
PRBEVNQIHEDCKHMGLLVGRGS I FSQTFRIAPSMCITKPEVDF 
AVEV FRSAIiTQHMERRAK 


6877 


1 


778 


GTSPSPARAYAPPTERKRFYQNVSITQGEGGFEINLDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDTIKYYTMHLTTLCNTSLDN 
PTQRNKDQLIRAAVKFLDTDTICYRVEEPETLVELQRNEWDPI I 
EWAE KRYGVE I SSSTS IMGPS IPAKTRE VLVSHLAS YNTWALQG 
I E FVAAQLKSMVLTLGL I DLRLT VEQAVLLSRLEEE YQ I Q KWGN 
IEWAHDYELQELRARTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


QTLQG DF KNRAEM I DFN I R I KNVTRSDAGK YR CE VS APS EQGQN 
Ltcnui VI LiEVXjVAPAVPSCErVPSSALSGTVVELRCQDKEGNPAP 
E YTW FKDG IRLLE N PRLGS QSTNS S YTMNTKTGTLQ FNTVS KLD 
TGEYSCEARNSVGYRRCPGKRMQVDDLNISGI IAAWWALVIS 
VCGLG VC YAQRKG YFS KETS FQKSNS S S KATTMS ENDF KHT KS F 
II 


6879 


3 


845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDGTTVTVRVK 
rd1 '' 3 * lu y» iuai/\aivvIjMJJI>1 1 vNiFAJjFEVISHSFVRKItAPNE 
FPHKLY IQNYTSAVPGTCLTIRKWLFTTEEEI LLNDNDLAVTYF 

FHQAVDDVKKGYIKAEEKSYQLQKLYEQRKMVNYLNMLRTCEGY 

NET T pp HC ACTDS H R KflHVT T1V TQT TH B*VT .u J\ rvrrpp r\T nvim rm 
4 ■*■ * i*-k»/-»v>i>i3i\.rtrv,v3n v x x Hlol irlr AJ_irl/\V-. 1 iitiov-Li-KNQ V IA 

FEWDEMQRWDTDEEGMAFC F E YARGEKKP RWVK I FT P Y FN YMHE 
CFERVFCEIiKWRKEEY 


6880 


2110 


1437 


RKDN CTA KE W T F P EAK WN TTAR VFS H I RLGMGH VL 1 1 VQ C F I SS 

MANIYNEKILKEGNQLTESIFIQNSKLYFFGILFNGLTLGLQRS 

NraQIJCNCGFFYGHRAFSVALIFVTAFQGLSVAFILKFLDNMFH 

Vl^QvTTVIITTVSVLVFDFRPSLEFFLEAPSVLLSIFIYNAS 

KPQVPEYAPRQERIRDLSGNLWERSSGDGEELERLTKPKSDESD 
EDTF 


6881 


2638 


2244 


NDSKWEDIHVI'I-GALKMFFRELPEPIjKTFNHFNDFVNAIKQEPR 
QRVAAVKDLIRQLPKPNQDTMQILFRHLRRVIENGEKNRMTYQS 
IAI VFGPTLLKPE KETGNI AVHTVYQNQ I VEL I LLELSS I FGR 


6882 


1 


850 


G I PE AQLW I YP VKS CKGVP VS EAE CTAMGLRSGNLRDRF WliVIN 
QEGNMVTARQBPRLVLISLTCDGDTLTLSAAYTKDLLLPIKTPT 
TNAVH KCRVHGLE I EGRDCGE ATAQW I TS FLKS QP YRLVHFE PH 
MRPRRPHQIADLFRPKDQlAYSDTSPFLlLSEASIiADI^TSRLEK 








KVKATNFRPNI VISGCDVYAEDS WDELLIGDVELKRVMACSRCI 
LTTVDPDTGVMSRKEPLETLKSYRQCDPSERKLYGKSPLFGQYF 
VLBNPGTIKVGDPVYLLGQ 


6883 


2794 


2256 


NSKLKLNQNLKLFITLTYQVLSLHGWGPdlHLOKEGAFPVTQNR 
ALQLLYDLR YLNI VLTAKGDEVKSGRSKPDSRI EKVTDHLEALI 
DPFDLDVFTPHLNSNLHRLVQRTSVtiFGIiVTGTENQLAPRSSTF 
NSQEPHNILPLASSQIRFGLLPLSMTSTRKAKSTRNIETKAQYD 
AKC 


6884 


2 


99 


EFERVTABAVKPRETSEPRAAAQRFCEKFPFL 


6885


297 


1554 


STGQFWHVTDLHLDPTYH I TDDHTKVCASSKGANASNPG PFGDV 
LCDS PYQLILS AFDFIKNSGQEAS FMIWTGDS P PHVPVPELSTD 
TV I NV I TNNTTT I QS LFPNLQVF PALGNHDYW PQDQLS WTS KV 
YNAVANLWKPWLDEEAI STLRKGG F YSQKVTTNPNLR IIS LNTN 
LYYGPNIMTLNKTDPANQFEWLESTLNNSQQNKEKVYIIAHVPV 
GYLPSSQNITAMREYYNEKLIDIFOKTSDVIAGQFYGHTHRDSI 
MVLS DKKGSPVNS LFVAPAVTPVKS VLEKOTNNPG IRLFQYDPR 
DYKLLDMLQYYLNLTEANLKGESIWKLEYILTQTYDIEDLQPES 
LYGLAKQFTILDSKQFIKYYNYFFVSYDSSVTCDKTCKAFQICA 
IMNLDNISYADCLKQLYIKHNY 


6886 


2 


1341 


QCGG I PGREGGSS RPLEEGTGSS PACVRG AAPGSEDAFYPTRAK 
QARVSQELKKAAKRTVSISEGPDTLGDGMRERRETLALAPEPEP 
LE KEACEKW KR P FRS AS ATS LTLSHCVDWKGLLD FKKRRGHS I 
GGAPEQRYQI I PVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSS 
LLATGGADRL I HLWNWGS R LE ANQTLEG AGG S ITS VDFDPSG Y 
QVLAATYNQAAQLWKVGEAQSKETLSGHKDKVTAAKFKLTRHQA 
VTGSRDRTVKEWDLGRA YCSRTINVLS YCNDWCGDHI I ISGHN 
DQKIRFWDSRGPHCTQVIPVQGRVTSLSLSHDQLHLLSCSRDNT 
LKVI DLR VSNI RQVFRADGFKCGSDWTKAVFS PDRS YALAGSCD 
GALYIWDVOTGKLESRLQGPHCAAVNAVAWCYSGSHMVSVDQGR 
KWLWQ 


6887


1047


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLGGCCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQS PRS PAGP FRGGTGWWPE PAVCLCVAVGPQRLS S PGLVY 
NASGS EHC YD I YRL YHS CADP TGCGTG P D ARAWD YQ ACTE I NLT 
FASNNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWLLTSFWGG 
DLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAHHLD 
LRASHPEDPASWEARKLEATI I G EW VKAARREQQ P ALRGGPRL 
SL 


6888 


1 


992 


FVAYVKKEI PHI WTHCLLNPHALVI KTLPTKLRDALFT WRVI 
NFIKGRAPNHRLFQAFFEEIGIEYSVLLFKTEMRWLSRGQILTH 
I FEMYEE INQFLHHKSSNLVDGFENKEFKIHLAYLADLFKHLNE 
IiSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EEIIVSDNEGIFIAAEITLHLQQLSNFFHGYFSIGDLNEASKWI 
LDPFLFNIDFVDDSYLMKNDLAELRASGQILMEFETMKLEDFWC 
iMrir«ii/\A>iHJj£iiijrlf rAl X ilj^liljvr oilFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


1 


1534 


LTLENQIKEEREQDNSESPNGRTSPLVSQNtJEQGSTLRDLLTTT 
AGKLRVGSTDAGIAFAPVYSMGAPSSKSGRTMPNILDDIIASW 
ENKI PPSKTS KINVKPELKEEPEES I ISAVDENNKLYSDI PHSW 
ICEKHILWLKDYKNSSNWKLFKECWKQGQPAVVSGVHKKMNISL 
WKAESISLDFGDHQADLLNCKDS I ISNANVKEFWDGFEEVSKRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLASHLPGFFVRPDLGPRLCSAYGWAAKDHDIGTTNLHI 
BVSDWNILVYVGIAKGNGILSKAGILKKFEEEDLDDILRKRIiK 
DSSEIPGALWHIYAGKDVDKIREFLQKISKEQGLEVLPEHDPIR 
DQSWYVNKKLRQRLLBEYGVRTWTLIQFLGDAIVLPAGAIiHQVQ 


6890






N FHS CI QVTEDFVS P EHLVES FHLTQELRLLKEE IN YDDKLQ VK 
NILYHAVKEMVRAIiKIHEDEVDDMEEN 




3 


667 


THACGMWI PLYLHRALWHKTARTPKiQ"ppfy a vno i TD^KYnffn — 
TG FLG VDTGAG ATRM CRLKTQRADPLVCAVGMLGSA IFICLIFV 
AAKSS I VGAYI CI FVGETLLFSNWAITAD I LMYWI PTRRATAV 
ALQSFTSHLLGDAGSPYLIGFISDLIRQSTKDSPLWEFLSU5YA 
LMLCPFVWIX5GMF FLAT ALFFVSDRARAEQQWQLAMP PAS VK 


6891 


1980 


1262 


LRIHQELL5KELKLI^GITIESIIHIGLAAGKEQFM6DASNVMQ 
xjjjua. i yonu i wncajiyn t* e, vkuaaay GLGVMAQFGGDDYRSLCSE 
AVPLIjVKVIKPJUiSKTKKNVlATENCISAIGKILKFKPNC^VNVD 
EVLPHWLSWLPLHEDKEEAIQTLSFLCDLIESNHPWIGPNNSN 
LPKIISIIAEGKINETINYEDPCAKRLANVVRQVQTSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892
6893


3 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSLLLPAESRSSV 
FALRAFNVELAQVKDSVSEKTlGIiMRMQFWKKTVEDIYCDNPPH 
\jr v*ii aijWRAVIU<HNLTKRWI^KIVDEREKNLDDKAYRNI KELE 
NYAENTQSSLLYLTLE I LGI KDLHADHAASHI GXAQG I VTCLRA 
TPYHGSRRKVFLPMDICMLHGVSQEDFLRRNQDXNVRDVIYDIA 
&v**nunurixuw.& tnMV i'V i\Ar PAFLQTV 9 LED FLKKI QRVD FD 
I FHPSLQQKNTLLPLYLYIQSWRKTY 


6894


1 


842 


dgerksmsvertfseinkaeeqyslcqelcselaqdlqickrLkg 

RTVTIKLKNVNFEVKTRASTVSSWSTAEEIFAIAKELLKTEID 
ADFPKPLRLRLMGVRISS FPNEEDRKHQQRS I IGFLQAGNQALS 
ATECTLEKTDKDKFVKPLEMSHKKSFFDKKRSERKWSHQDTFKC 
EAVNKQSFQTSQPFQVLKKKMNENLEISENSDDCQILTCPVCFR 
AQGCI SLEALNKHVDECLDG PS I S ENFKMFS CSHVSATKVNKKE 
NVPAS S LCEKQDYEAH 




1742


1463 


TTLCK P LV PR EHQ F Y ETL P AEMRKFT P.QYKG KS QLL EG L PHWRG 

DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQHLTPS 
VFNPW 


6895 


2379 


478 


VTYVELCDLASPTALLiMRTVLDLIVEDLQSTSEDKEQQYTSQT 
TRLLALL YALASHKAC KLA I LHL INGT I KGDE RYAE I FQDLLAL 

VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSSSEGSISEL 
^WJ-»«ouri>irasjjra i & IUIJ^IjIiATIjANSESSYNCLLTCVRTMMFL 
AEHDYGLFHLKSSLRKNSSALHSLLKRWSTFSKDTGELASSFL 
EFMRQILNSDTIGCCGDDNGLMEVEGAHTSRTMSINAAELKQLL 
QSKEESPENLFLELEKLVLEHSKDDDNLDSLLDSWGLKQMLES 

SGDPLPLSDQDVEPVLSAPESLQNLFNNRTAYVLADVMDDQLKS 
MWFTPFOAEEIDTDLDL.VTCVnT.TPT.QPvnr'cnTrriT ucdt r»r»r>rrr 

SEPSSPGRTKTTKGFKLGKHKHETFITSSGKSEYIEPAKRAHVV 
PP PRGRGRGGFGQG I RPHD I FRQR KQNTS RP PSMH VDDFVAAES 

KEWPQDGIPPPKRPLKVSQKISSRGGFSGNRGGRGAFHSQNRF 
FTP PAS KGNYS RREGTRGSS WS AQNT PRGNYNES RGGQSN FNRG 

PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSWASANSGSGGSRG 
KFVSGGSGRGRHVRSFTR 


6896 


1 


555 


GNIVIQKKKYNKQHIIPLENVTIDSIKDEGDLRNGWLIKTPTKS 
FAVYAATATEKSEWMNHINKCVTDLLSKSGKTPSNEHAAVWVPD 
S E ATVCMRCQKAKFT PVNRRHH CR KCG FWCGPCSE KRFLLPS Q 
S S KP VR I CD FC YDLLS AGDMATCQ P AR SDS YSQSLKS P LNDMS D 
DDDDDDSSD 


6897 


3 


920 


U UGLMH E WNG LMERPDW ETAI QKPLCS L P AGSGNALAAS LNHY 
AGYEQVTNEDLLTNCTLLLCRRLLSPMNLLSLHTASGLRLFSVL 
SIAWGFIADVDLESEKYRRI^EMRFTLGTFIiRLAALRTYRGRLA 
YLPVGRVGSKTPASPVWQQGPVDAHLVPLEEPVPSHWTWPDE 
DF VL VLALLHS HLGS EMFAAPMGRCT^AGVMHL F YVRAG VSRAML 








LRLFI^EKGRHMEYECPYLVYVPWAFRLBPKDGKGVFAVDGE 
LMVS EAVQGQVHPNY FV7MVSGCVE P PPSWKPQQMPPPEE PL 


6898 


919 


346 


QKTVTAVASLLKGRQG I YTENERRMGAVI K I R FFKIMLVLI ICW' 
LSN I INESLLFYLEMQTDINGGS LKP VRTAAKTTWFIMG I LNPA 
QGFLLS LAFYG WTGCSLGFQS PRKE I QWES LTTS AAEGAH PS PL 
MPHENPASGKVSQVGGQTSDEALSMLSEGSDASTIElHTAqpqr 
NKNEGD PALPTHGDL 


6899 
6900


120 


827 


MKVRKNNDAYLLDKNKINMDCFISCFFKKMI.TTTMPQMqr' rrcr 
LE HGEE YT FSL PCAYARS I LTVPWVE LGG KVS VNCAKTG YS AS I 
TFHTKPFYGGKLHRVTAEVKHNITNTWCRVQGEWNSVLEFTYS 
NGBTKYVDLTKLAVTKKRVRPLEKQDPFESRRltWKNVTDSLRES 
E IDKATEHKHTLEE RQRTE ERHRTETGT PW KT KY FI KEGDG WVY 
HKPLWKI IPTTQPAE 




3 


451 


TEVLGSKGIHELRSSTSALHHALEESASLLTMFWRAALPSTHIP 
VLPGKVGESTERELLELRTKVSQQEQLLQSTTEHLKNANQQKES 
MEQF I VSQLTRTHDVLKKARTNLEVRKLLHQSEAPSLS PTHHHP 
LADLVGDSWPALRFQEK 


6901 


1 


201 


DDl^VyKLBTDFKMTLQQQSTLEQWAAWLDNVMMQALKPYEGRP "~ 
SFPKAARQFLLKWSFYRYHLGFS 


6902 


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNPSSALEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 


1 


149 


R I NQ VYRQG PTG I HI LV I DQMVQN FQDES CFLFST V KAES S DG I 
HULK 


6904 


464 


2092 


MEASLPVSLSCVLACGDVEGKFDILFNRVQAIQKK6GNFDLLLC 
v vr * r r v» o i yu/ic* w 3t k i\jr x KKAF I QT YVLGANNQETVKYFQDA 
DG CELAEN I T YLGRKG I FTG S SGLQ I VYLSGTES LNEP VPG YS F 
oouru muti l 3W~l\.«VDJlijJjTSPWPKCVGNFGNSSGEVD 
TKKCGSALVSSLATGLKPRYHFAALEKTYYERLPYRNHIILQEN 
AQHATRFIALANVGNPEKKKYLYAFSIVPMKLMDAAELVKQPPD 
VTENPYRKSGQEASIGKQILAPVEESACQFFFDLNEKQGRKRSS 
TGRDS KS S PHPKQPRKPPQPPG PCWFCLAS PE VE KHL WNI GTH 
C YLALAKGGLS DDHVLILP IGHYQSWE LSAEWP P VP k vv h. tt 
RRFFKSRGKWCVVFERNYKSHHLQLQVI P VPISCSTTDDI KDAF 
ITQAQEQQIELLEIPEHSDIKQIAQPGAAYFYVELDTGEKliFHR 
IKKNFPLQFGREVLASEAILNVPDKSDWRQCQISKEDEETLARR 
FRKDFEPYDFTLDD 


6905 


1 


226 


VSKTGEAETITSHYLFALGVYRTLYLFNWIWRVHFEGFFDLIAI 
VAGLVQTVLYCDFFYLYITKVLKGKKLSLPA 


6906 


3 


6X1 


SYDDHNGHIDFITAASNLRAKMYSIEPADRFICTKRIAGKIIPAI 
ATTTATVSGLVALEMIKVTGGYPFEAYKNWFLNLAIPIWFTET 
TEVRKTKIRNGISFTIWDRWTVHGKEDFTLLDFINAVKEKYGIE 
PTMWQG V KML YVP VMPGHAKR L KLTMHKLVKPTTEKK YVDLT V 
S FAPD I DG DEDLPGP P VRYY FS H DTD 


6907 


2 


2228 


LRGVPVWAAGAFRFSSGEESTSHLIMSRRSQRLTRYSQGDDDGS 
SSSGGSSVAGSQSTLFKDSPLRTLKRKSSNMKRLSPAPQLGPSS 
DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGSESSRASGLVGRKATEDFLQSSSGYSSEDDYVGYSDVDQQSS 
SSRLRSAVS RAGSLLWMVATS PGRLFRLLYWWAGTTWYRLTTAA 
SLLDVFVLTRRFSSLKTFLWFLLPLLLLTCLTYGAWYFYPYGLQ 
TFHPALVSWWAAKDSRRADEGWEARDSSPHFQAEQRVMSRVHSL 
ERRLEALAAEFSSNWQKEAMRLERLELRQGAPGQGGGGGLSHBD 
TLALLEGLVSRREAALKEDFRRETAARIQEELSALRAEHQQDSE 
DLFKKIVRASQESEARIQQLKSEWQSMTQESFQBSSVKELRRLE 
DQIjAGIiG^EIAAliALKQSSVAEEVGLLPQQIQAVRDDVESQFPA 
MISQFLARGGGGRVGLLQREEMQAQLRELESKILTHVAEMQGKS 








AREAAASLSliTLQKEGVIGVTEEQVHHIVKQALQRYSEDRIGLA 
DYALESGGASVISTRCSETYETKTALLSLFGIPLWYHSQSPRVI 
LQ PD VH PGNCW AFQGPQG FAWRL S ARI RPTAVTLEHVPKALS P 
NS T I SSAPKDFAI FGFDEDLQQEGTLLGKFTYDQDGEP I QTFHF 
QAPTMATYQ WE LR I LTNWGHPE YTC I YRFRVHGEP AH 


6908 


3 


780 


QV P S AAWLMAVCGLGS RLG LG S R LGLQGC FGAARLL Y PR FQS RG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALELVTEKGHTFAEELQKIQCTLQDV 
GSALATPCS S AREAHLKYTTFKAG P I LELEQWI DKYTSQLP PLT 
AFI LPSGGK I SS ALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLS DYL FTLARYAAM KEGNQE K I Y KKND P S AES EGL 


6909


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSNSLLPEKAFSDIP 
S PYLRGTI KMMQAVRQAFQDQDDRRT WDGRPLTMAATFDDCLYA 
LCWDTIKRSSQTGEWQNIAIMTEEPELSPAYLISEAMRRSRMS 
LYC 


6910 


1 


1068 


LVPVWIDSYYYGKLVIAPLNIVLYNIFTPHGPDLYGTEPWYFY 
LINGFLNFNVAFAIALLVLPLTSLMEYLLQRFHVQNIX3HPYWLT 
LAPMYIWFI I FFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVFQRYRLEHYTVTSNWIoALGTVFLFGLLSFSRSVA 
LFRGYHGPLDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PSSFLLPDNWQLQFIPSEFRGQLPKPFAEGPLATRIVPTDMNDQ 
NLEEPSRYIDISKCHYLVDLDTMRETPREPKYSSNKEEWISLAY 
R P FLDAS RS S KLLRAF YVP FLS DQ YTV YVNYT I LKPR KAKQ IRK 
KSGG 


6911 


1184 


966 


GEDAEEMETGNVANIiISIFGSSFSGLLRKSPGGGREEEEGEESG 
PEAAEPGQICCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKPVETHSFOMLFTILSTGSALKAQS YEDAYRCI KSS ILLGS I 
SGGTDIISCFMGHNFSLPVYKGBIOARNLGMAVEAWNEEGKAVW 
GESGELVCTKPIPCQPTHFWNDENGNKYRKAYFSKFPGIWAHGD 
YCRINPKTGGIVMLGRSDGTLNPNGVRFGSSEIYNIVESFEEVE 
DSLCVPQYNKYREERVILFLKMASGHAFQPDLVKR I RDAI RMGL 
SARHVPSLILETKGIPYTLNGKKVEVAVKQIIAGKAVEQGGAFS 
NP ETLDbYRD I PELQG F 


6913 


1643 


1558


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAGYPGTLIPYRCDLSNfelfeiblLSMF^AIRSQHSGVDI 
CI^AGIJVRPDTLLSGSTSGWKDMFKVNVLALS I CTREAYQSMK 
ERNVDDGHIININSMSGHRVLPLSVTHFYSATKYAVTALTEGLR 
QELREAQTHI RATCI SPG WETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVI YVLSTPAH IQIGD IQMRPTEQVT 


6915 


254 


652 


GRSLS FKTFL I WVLiI S I YQGGI LMYGALVLFESEFVHWAI S FT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS 


6916 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHWAISFT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6917 


254 


652 


GRSLS FKTFLIVTVLIS I YQGGI LMYGALVLFESEFVHWAI S FT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD 
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6918 


28 


921 


PEAGTRSWRE PD PEDLRRFLLSAACRS FPQWLPGGGGGQVSSCS 
DTDVP YLLLAVKS E PGRFAERQAVRETWGS PAPG I RLLFLLGS P 
VGEAGPDLDSLVAWESRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
LGRHCPTVSFVLRAQDDAFVHTPALLAHLRALPPASARSLYLGE 
VFTQAMPLRKPGGPFYVPESFFEGGYPAYASGGGYVIAGRLAPW 
LLRAAAR VAP F P F EDVYTGLC I RALGLVPQAHPG F LTAW PAD RT 
ADHCAFRNLLLVRPLGPQAS IRLWKQLQDPRLQC 


6919 
6920 


850 
1418 


41 
591 


QGRKELSGSVFCPFIQQEPKEMLTLSEYHERVRSQGQQLQQLQA

ELDXIiHKEVSTVRAANSERVAKLVFQRLNEDFVRKPDYALSSVG 

ASIDLQKTSHDYADRNTAYFWNRFS FWNYARPPTVILEPHVFPG 

Na^AFEGDQGQWIQLPGRVQLSDITLQHPPPSVEHTGGANSAP 

RDFAVFFLLSFFTHQGLQVYDETEVSLGKFTFDVBKSEIQTFHIi 

QNDP PAAFP KVKI Q I LSNWGH P RFTCLYR VRAHG VRTS EG AEG S 

AQGPH 

EAQGPSKVHLTLKKKK 


6921


2 


1711 


MNATRSEEQFHVINHAEQTLRKMENYLKEKQLCDVLLIAGHLRI

PAHRLVLS AVS DY FAAM FTNDVLEAKQEBVRMEG VDPNALNS L V 

QYAYTGVLQLKEDTIESLLAAACLLQLTQVI DVCSNFLI KQI^HP 

SNCLGIRSFGDAQGCTELIiNVAHKYTMEHFIEVIKNQEFLLLPA 

NE I S KLLCSDD INVPDEETI FHALMQWVGHDVQNRQGELGMLLS 

YIRLPLLPPQLLADLETSSMFTGDLECQKLLMEAMKYHLLPERR 

SMMQSpRTKPRKSTVGAJjYAVGGMDAMKGTTTIEICYDLRTNSWL 

HIGTMNGRRLQFGVAVIDNKLYWGGRDGLKTLNTVECFNPVGK 

IWTVMPPMSTHRHGLGVATLEGPMYAVGGHDGWSYLNTVERWDP 

EGRQWNYVASMSTPRSTVGWALNNKLYAIGGRDGSSCLKSMEY 

FDPHTNKWSLCAPMSKRRGGVOVATYNGFLYWGGHDAPASNHC 

SRLSDCVERYDPKGD3W3TVAPI>SVPRDAVAVCPLGDKLYWGG 

YDGHTYLNTVESYDAQRNEWKEEVPVNIGRAGACWWKLP 


6922 


1075 


369 


LTPPAGIRHEVRDRBRERERERKREKFPLDSTGSELKQNlfl§lT~ 
GL P PAMQKVM YKGLAPEDKTLRE I KVTSGAKI MGGGSTINDVLA 
VNTPKDAAQQDAKAEENKKEPLCRQKQHRKVLDKGKPEDVMPSV 
KGAQERLPTVPLSGMYNKSGGKVRLTFKLEQDQLWIGTKERTEK 
LPMGS I KNWS E P I EGHED YHMMAFQLGPTEAS YYWVYWVPTQY 
VDAI KDTVLGKWQYF 


6923 


2469 


1660 


LGL F C I LP 1 DTLC AVLE RDTLS I RE S R bFGA WR WAEAE CQRQQ " 
LPVTFGNKQKVLGKALSLIRFPLMTIEEFAAGPAQSGILSDREV 
VNLFLH FTVN PKPR VE YI DRPRCCLRGKECCINRFQQ VES R WG Y 
SGTSDRIRFTVNRRISIVGFGLYGSIHGPTDYQVNIQIIEYEKK 
QTIiGQNDTGF S CDGTANT FRVMF KE P I E I L PN VC YTACATL KG P 

DSHYGTKGLKKWHETPAASKTVFFFFSSPGNNNGTSIEDGQIP 
EIIFYT 


6924 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGALAKKPYNPIIGETFHCSWEVP

KDRVKPKRTASRSPASCHEHPMADDPSKSYKLRFVAEQVSHHPP 

ISCFYCECEBKRLCVNTHVWTKSKFMGMSVGVSMIGEGVLRLLE 

HGEEYVFTLPSAYARSILTIPWVELGGKVSINCAKTGYSATVIF 

HTKPFYGGKVHRVTAEVKHNPTNTIVCKAHGEWNGTLEFTYNNG 

ETKVIDTTTLPVYPKKIRPLEKQGPMESRNLWREVrRYLRLGDI 

DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSGILQ 

SPLESTLMGLEVQSFPV 


6925 
6926 


2 
1 


1653 
733 


RGGAAGAAMBPDSVIEDKTIELMCSVPRSLWLGCANLVESMCAL 
SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWSESDQVEFVEHLISRMCHYOHGHINSYLKPMT.nRnPTTAT d 

EQGLDHIAENILSYLDARSLCAAELVCKEWQRVISEGMLWKKLI 
ERMVRTDPLWKGLS ERRGWDQ YLF KNRPTDG P PNS F YRS L Y P K I 
IQDIETIESNWRCGRHNLQRIQCRSENSKGVYCLQYDDEKIISG 
LRDNSIKIWDKTSLECLKVLTGHTGSVLCLQYDERVIVTGSSDS 
TVR VW D VNTG E VLNT L I HHN EA VLH L R FSNG LMVT C S KDRS I A V 
WDMASATDITLRRVLVGHRAAVNWDFDDKYIVSASGDRTIKVW 
STSTC E FVRTLNGHKRGI ACLQ YRDRL WSGSSDNT I R LWD I EC 
GACLRVLEGHEELVRCIRFDNKRIVSGAYDGKIKVWDLQAALDP 
RAPASTLCLRTLVEHSGRVFRLQFDEFQIISSSHDDTILIWDFL 
NVPPSAQNETRS PSRTYTYISR 

SGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSLGAPPL 
6927






DGYPLPTPUTSPLDUVDPDPAFFAAPMVGUCPAAGTYSYAQVSD 1 
YAGPPEPPAGPMHPRLGPEPAGPSIPGLLAPPSALHVYYGAMGS
PG AGGGRGFQMQ PQHQHQHQHQHH P PG PGQPTP PPEALPCREX5T 

DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPD 
SHGAISSWSDASSAVYYCNYPDV f 




2 


1484 


CQGFAWATDLSTDLESQLSVSCKCYEAANHILQFRDLKSQNPEH
YVQVLKRMGNIRNEIGVFYMNQAAALQSERLVSKSVSAAEQQLW

x ^ ^ *a v.r oRVjittw rtii bUA I NAAJjLLCNTGRLMRICAQAHCGA 1 
GDEI.KRBFSPEEGLYYNKAIDYYLKALRSLGTRDIHPAVWDSVN 
WBLSTTYFTMATLQQDYAPLSRKAQEQIEKBVSEAMMKSLKYCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VLADLHYSKAAKLFQLLKDAPCELLRVQLERVAFAEFQMTSQNS
NVGKLKTLSGALDIMVRTEHAFQLIQKELIEEFGQPKSGDAAAA 
ADASPSLNREEVMKLLSIFESRLSFLLLQSIKLLSSTKKKTSNN 

IEDDTILKTNKHIYSQJjLRATANKTATLLERINVIVHLLGQLAA 
GSAASSNAVQ


6928 


1086 


777 


t^ium mnLiLjU V KlnKKR *s VDKTLSHPWLQDYQTWLDLRELECK 

igeryithesddlrwekyageqglqypthlinpsashsdtpete 
etemkalgervsil 


6929 
6930 


1749 


607 


rdqrgyrddksparepgdvsartrsgggggrsattampppvpngH 

im ^riy HDFQDIiRHNGNVVVAGRPS CS RG PRRAIQ K PQ P AGGRRSG 

rgpaagglcioppdggtcvpeeppvppmdwealekhlaglqpre 
qevrnqgqartnstsaqkneresirqklalgsffddgpgiytsc 
sksgkpsi^srlqsgmnlqicfvndsgsdkdsdaddsktetsld 
tplspmskqsssysdrdtteeeseslddmdfltrqkklqaeakm 

/iurtii/vis.ffTft v fc, v& KQNR K KS P VADLLPHM PH I S ECLM KRS L 

kptdlrdmtigqlqvivhdlhsqieslneelvqlllirdelhte 
qdamlvdiedltrhaesqqkhmaekmpak 1 




131 


545 


fkdtanvfvslfqmrnnfrhyfiepsqlklfydvitwiVtqvai 
sytwpfvllsikpsltfysswyyclhilgilvllllpvkktqr 
rkntheniqlsqskkfdegenslgqnsfsttnnvcnqnqeiasr 

HSSLKQ


6931 


2 


659 


FVE R LPN R P AUIiL VASGAAEG VSAQS FLHCFTMAS TAFNLQVAT 
PGGKAMEFVDVTESNARWVQDFRLKAYASPAKLESIDGARYHAL

A ^ rvjU4iJ MabLAR I LQH FHS E S KP I CAVGHGVAALCC I 
ATNEDRSWVFDSYSLTGPSVCELVRAPGFARLPLWEDFVKDSG 
ACFSASEPDAVHWLDRHLVTGQNASSTVPAVQNLLFLCGSRK 1 


6932 


2 


1131 


FVDSPGQGEUAEEEEGGIQMNSRMRAHSPAEGASVESSSPGPKX | 

SDMCEGCRSLAAGHPGYISHDKETSIKYVSHQHPSHPQLFSIVR 
QACVRSLSCEVCPGREGPTPPflnt , niTf3i?ve'CLrrDc»Tv^or 1 

QR WYS 1 1 TIMMDR I YL I NS WPFLLGKVRG I IDELQG KALKVFEA 
EQFGCPQRAQRKNTAFTPFIiHQRNGNAARSLTSLTSDDNLWACL 
HTS F AW LLKACGS RLTE KLLEGA PTEDTbVQMEKLADLE EES ES 
WDNSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRHMRQVGGRGTAHHELRRRANHGLCLPTRLASGPSTL 
KTLQEVTDSLLGGWLMAQGVGGI I 


6933 


1431 


890


SLNLHCTLPPPPHQYPAGYPSDKBGKKPKGQSKKQPSGTTKRPI | 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIREDCQNQKLW 
DEVLSHLVEGPNFLKKLEQSFMCVCCQELVYQPVTTECFHNVCK 
DCLQRSFKAQVFSCPACRHDLGQNYIMIPNEILQTLLDLFFPGY
SKGR 


6934 


3030 


2588 


DRDHSQCGGlRRVAIiARVSSVKLISKAKIRTVKMTFlIVLAFiV \ 
CWTPFFFVQMWSVWDANAPKBASAFI I VMbLASI^SCCNPWIYM 

LFTGHLFHEIiVQRFLCCSASYLKGRRLGETSASKKSNSSSFVLS 
HRSSSQRSCSQPSTA


6935 


866 


543 


NSALYVAGGNDGTSCLNSVERYSPKAGAWESVAPMNIRRSTHDL
VAMDGWLYAVGGNDGSSSLNSIEKYNPRTNKWVAASCMFTRKSS 
VGVAVLELLNFPPPSSPTLSVSSTSL 


6936 


1347 


567 


RSHRRQFLSRALLEFFGKSHPPPHRLFRKSLNVGLHYSHIPFLT 
TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
MEKRLQEAQI»YKEEGNQRYREGKYRDAVSRYHRAI*IjQLRGIjDPS 
LPS PLPNLGPQGPALTPEQEN I IiHTTQTDCYNNLAACLLQMBP V 
NYERVREYSQKVIiERQPDNAKALYRAGVAFFHLQDYDQARHYIiL 
AAVNROPKDANVHRYT.OT.Tn<?FT C QYPPVFirnT vt r-Mirr' 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDV 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PCPPLEERAGCLEYSTPQGQDCGHTYVPAFITTSAFNKERTRQA
TSPHWSTHT3DAGYCMEFKTESLTPHCALENRPLTRWMQYLREG 
YTVCVDCQPPAMNSVSLRCSGDGLDSDGNQTLHWQAIGNPRCQG
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 


r* iKALiii iiA£.K v DT u FMQ LKKRRQS S E KEN DSGTLDT VG A WVDH 
EGNVAAAVSSGGLALKHPGRVGQAALYGCGCWAENTGAHNPYST
AVSTSGCX5EHLVRTILARECSHALQAEDAHQALLETMQNKF1SS 
PFLASEDGVLGGVIVLRSCRCSAEPDSSQNKQTLLVEFLWSHTT
ESMCVGYMSAQDGKAXTHISRLPPGAVAGQSVAIEGGVCRLGEP 
SELTLQAECEASQRHFRT


6939 


3 


810 


jvv i Afioif x SStiHti SDNbS V.L5GEL.P PAMGRTALFHHSGGSS 
GYESLRRDSEATGSASSAPDSMSESGAASPGARTRSLKSPKKRA
TGLQRRRLIPAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF
E I KV YE I DDVERI£R PR PTPRE APTQGLACVSTRLR LAERRQQR 
LREVQAKHKHLCEELAETQGRLMLEPGRWLEQFEVDPELEPESA
EYLAALERATAALEQCVNLCKAHVMMVTCFDISVAASAAIPGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCGGTERAIDQASFTTSMEWDTQ 
v v iwo o r iaj r M\j3LAjj\z, e, fJ\H\j r v? Jj rbW by PEKCAVFQCAQ CHAV 
LADSVHLAWDLSRSLGAWFSRVTNNWLEAPFr..VGIEGSLKGS 
TYNLLFCGSCGIPVGFHLYSTHAAIAALRGHFCLSSDKMVCYLL

KTKAIVNASEMDIQNVPLSEKIAELXEK1VLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI IGSNVLALAEAQRQAEALGYQA 
WLSAAMOGD VKSMAO PYGTA AHVAP TR T.TPQMana c\rc t? n nrvr 
HELAAELQIPDLQLEEALETMAWGRGPVCLLAGGEPTVQLQGSG
RGGRNQELALRVGAELRRWPLGPIDVLFLSGGTDGODGPTEAAG 
AWVTPELASQAAAEGLDIATFLAHNDSHTFFCCLQGGAHLLHTG
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCLLGDRLYADGGYDG 
QTYLNTMESYBPQTNEWTQMASIjNIGRAGACVVVI KQP 


6943 


1 


739 


PMATGDGAKTLAIHVKALTADSIRITWKATLPASSFRLSWLRLG
HSPAGGSITETLVQGDKTEYLLTALEPKPTYIICMVTMETTNAY
VADETPVCAKAETADS YGPTTTLNQEQNAG PMASLPLAG 1 1 GGA 
VALVFLFLVLGAICWYVHQAGELLTRERAYNRGSRKKDDYMESG 
TKKDNSILEIRGPGLQMLPINPYRAKEEYWHTIFPSNGSSLCK
ATHTIGYGTTRGYRDGGIPDIDYSYT


6944 


960


156 


VANI LLNGVKYESELTGS SERAEQPLS VGRLCSTI CNMPKALRT 
LCVIfflFLGWLSFEGMLLFYTDFMGEVVFOGDPKAPHTSEAYOKY 
NSGVTMGCWGMCIYAFSAAFYSAILEKLEEFLSVRTLYFIAYLA 
FGLGTGLATLSRNLYWLS LCITYG ILFSTLCTLPYSLLCD YYQ 
S KKFAGSSADGTRRGMG VD I S LLS CQ Y FLAQ I LVS LVLG PLTSA 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 
LNV 


6945 


2067 


179 


EG EDRGLPRTMGAAIiG TGTRLAP W PGRACGAL PR WT PTAPAQGC 
HS K PG PAR P VP LKKRG YDVTKNPHLNKGMAPTLEE RLQ LG I HGL 
I PPCFLSQDVQLLRlMRYYERQQSDLDKyi I LMTLQDRNE KLFY 
k v ii i s» uv t m p i VYT PTVGLACQH YGLTFRR P RG LF I T I HD KG 
HIATMLNSWPEDNIKAVVVTDGERILGLGDLGCYGMGIPVGKLA 
LYTACGGVNPQQCLPVLLDVGTNNEELLRDPLYIGLKHQRVHGK 
AYDDLLDEFMOAVTDKFGXNCLIQFEDFANANAFRLLNKYRNKY 
CM FNDDI QGTAS VAVAG I LAALR I TKN KLSNHVFG FQGAGEAAM 
G\IAHLLVMALE\KEGVPKA\EATRKIW\MVDF\KGLIVQGRDH 
LNHEKEMFAQD\HPE VNSLEEWRLVKPTA1 IGVAA I AEA\ FTE 
QI LRDMASFHERP\ 1 1 FALSNPTS KAECTA\E KCYRVTEG PRGF 
FAS\GSPF*GVLIWEMGKTFIPGGRGNNA+RVPRGWQLGVHSPG 
GDPGH I P\DE I FLPDSRAKLPQEVS EQHI*SQGRLYP\ PLST \ IR 
NVFLRIAIKVFD*GYKHNLV\SYYPEPKD\KEAFCKIPG8YTPD 
YDS FYT/VDS YI WAQGKAMNVQTV 


6946 


133 


2551 


S CE YSGI T VA PGDPCPGVAHLLiAPS MASDTPES LMALCTD FCLR 
NLDGTU3YLLDKETLRLHPDIFLPSEI\CDRLVNEYVEIiVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\LVQD\QD\LE 

airkqdl\vel\yltn\ceklsakslqtlrsfsht'lgvp+affg 
c\tnilllrkenpggl/cedeylfnptcqvlvkdftfegfsrlr 

F\LKLGRMIDWVPVES \LLRPLNSIAALDLSGIQTSDAA\ FLTQ 
WKDSL\VSLVL\YNMDLSDDIIIR\VIVQLHKLRHIJDISRDRLSS 
YYKFKLTREVLSLFVQKLGNLMSLDISG\HMILENCSISKIGKR 
EAGQTSI\EPSK\SSIIPFRGFEGGPLQF\LGVF*GIFCGRLTH 
I PAYKVSGDKNEEQVLNAI E AYTEHRPEITSRAINLLFD I AR I E 
RCNQLLRALKLVITALKCHKYDRN1QVTGSAALFYLTNSEYRSE 
QS VKLRRQVI Q WI>NGMES YQE VTVQRNCCLTLCNFS I PE ELEF 
QYRR VNE LLLS I LN PTRQDE S I QR I AVHLCNALVCQVDNDH KEA 
VGKMGFVVTMLKLTQKKLLDKTCI)QVWEFSW\SAXiWNITDETPD 
NCEMFI^FNGMKLFLDCLNEFPEKQELHRNMLGLLGNVAEVKEL 
RPQLMTSQFISVFSNLLESKADGIEVSYNACGVLSHIMFDGPEA 
WGVCEPQREEVEERMWAAIQSWDINSRRNINYRSFEPILRLLPQ 
GISPVSQHWATWALYNLVSVYPDKYCPLLIKEGGMPLLRDIIKM 
ATARQETKEMARKVI EHCSNFKEENMDTSR 


6947 


2 


1*82 


TS VS T I PRGliAS ARPQS RS WRCCP VWRRS PGRARGRGLKM LNVP 
SQSFPAPRSQQRVASGGRSKVPLKQGRSLMDWIRLTKSGKDLTG 
LKGRL I EVTE EELKKHNKKDDCWI C I RGFVYNVS P YM E YHPGGE 
DELMRAAGSDGTELFDQVHRWVNYESMLKECLVGRMAI KPAVLK 
D YREEE KKVLNGML PKSQ VTDTLAKEG PS Y PS YDW FQTDS LVT I 
/EHIY*TEGYQFRLNNS*SSE*FLYSRWWY*GLLISYTYW/R*A 
MRFRKIFLCGL/CESVGKIEIVLQKKENTSWDFLGHPLKNHNSL 
I PRKDTGL YYRKCQLI S KED VTHDTRL FCLMLPPS THLQ VP IGQ 

W\/VT. VT .D Y*TY""T , 'C T T/t/nvrrriTrnnn t v nnnvnmiT n»_.»._ 

riv i LiivLttrx VKPYXPVSGSLLSEFKEPVLPNNKYIYFLIK 
IYPTGLFTPELDRLQIGDFVSVSSPEGNFKISKFQELEDLFLLA 
AGTGFTPMVKILNYALTDI PSLRKVKLMFFNKTEDDI I WRSQLE 
KLAFKDKRLDVEFVLSAPISEWNGKQGHISPALLSEFLKRNLDK 
SKVLVCI CGPVPFTEOGVRLLHDLNFSKNEIHS FTA 


6948 


104 


58 


PDGAHSFFPDEYFTCSSLCLSCGVGCKKSMNHGKEGVPHEAKSR 
CRYSHQYDNRVYTCKACYERGEEVSWPKTSASTDSPWP4GLAKY 
AWSGYVIECPNCGWYRSRQYWFGNQDPVDTWRTEIVHVWPGT 
DGFLKDNNNAAQRLliDGMNFMAQS VS ELSLGPTKAVTSWLTDQ I 
APAYWRPNSQILSCNKCATSFKDNDTKHHCRACGEGFCDSCSSK

TRPVPERGWGPAPVRVCDNCYEAR/TRPVSCYRGTSGR*RRRRT 
QETVE 


6949 


152 


465


GLRLCLSRPLTRPGDDSVGGSAMASGAGGVGGGGGGKIRTRRCH I 
QGPIKPYQQGRQQHQGILSRVTESVKNIVPGWLQRYFNKNEDVC | 


6950 






acSTDTSEVPRWPENKEDHLVYAPEESSNlTDGRITPEPAVSNT ~ 

EEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 

SAFPIGSSGFSLVKBIKDSTSQHDDDNISTTSGPSSRASDKDIT 

VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 

S LGNS S I LKTSQI/3DS P F Y PG KTT YGGAAAAVRQS KLRNTP YQA 

PVRRQMKAKQLSAQSYGVTSSTARRILQSLEKMSSPLADAKRIP

SIVSSPLNSPLDRSGIDITDFQAKREKVDSQYPPVQRLMTPKPV 

SIATNR5VYFKPSLTPSGEFRKTNQRIDKKCSTGYEKNMTPGQN 

REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 

LEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 

QALTNKVQMTSPSSTGSPMFKFSSPIVKSTEANVLPPSSIGFTF 

SVPVAKTABLSGSSSTLEPIISSSAHHVTTVNSTNCKKTPPEDC 

EGPFRPAEILKEGSVLDILKSPGFASPKIDSVAAQPTATSPWY 

TRPAISSFSSSGIGFGESLKAGSSWQCDTCLLQNKVTDNKCIAC 

QAAKLSPRDTAKOTGIETPNKSGKTTLSASGTGFGDKFKPVIGT 

WDCDTCLVQNKPEAI KCVACETPKPGTCVKRALTLTVvs ESAET 

MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 

VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 

ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS

SFKFGVSSSSSGPSQTLTSTGNFKFGDQGGFKIGVSSDSGYINP 

MSEGF*FSKHIVGFKFGVSSESKPEEVKKDSKNDNFKFGLSFGL 

SNPVFLTPFQFGVSNLGQEEKKEELLK33CAGFRFGTGV1NSTR 

VPANTIVTSENKSSFNLGTIETKSVSVAPLKCQTSEAKKEEMPA

TKGGFSFGNVEPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 

KLTMKEPKC\QPVFSPGEFQRQTKDENSSKSTFSFSMTKPSEKE 

SEQPAKATFAFGAQTNTTADQGAAKPDLSYLNNSSSSSSTPATS 

AGGG \ I FGS STSS SN PP VATFVFGQS S N PGS S S \ AFGNTAES ST 

SQSLLFSQDSKLATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 

ATTTSSSAGSSFVFGTGPSAPSASPAFGANQTPTFGQSQGASQP 

NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 

SAFGSGTTPNSSSAFQFGSSTTNFNFTNNSPSGVFTFGANSSTP 

AASAQPSGSGGPPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 
TAVRRRK 


6951 


2585 


411 


FKPGSRSGLCKRAGERGAVkAGGliSRRTRAE * XMDELHYQDTDS ' 
DVPEQRDSKCKVKWTHEEDEQLRALVRQFGQQDWKFLASHFPNR 
TDQQCQYRWLRVLNPDLVKGPWTKEEDQKVXELVKKYGTKQWTL 
IAKHLKGRLGKQCRERWHNHLNPEVKKSCWTEEEDRI ICEAHKV 
LGNRWAE I AKMLPGRTDNAVKNHWNSTI KRKVDTGGFLS E5KDC 
*****u«"»j-'iUA»iiyo/tyt p 1 £»uyubIjjjTNWPSVPPTIKEEEN 
SEEELAAATTSKEQEPIGTDLDAVRTPEPLEEFPKREDQEGSPP 
ETSLP YKWWEAANL L I PA VGS S LSEALDL I ES DPDAWCDLS K F 

DLPEEPSAEDSINNSLVQLOASHQQQVLPPRQPSA\LVPSVTEY 
RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK 
RRVALSPVTENSTSLSFLDSCNSLTPKSTPVKTLPFSPSQFLNF
WNKQDTLELESPSLTSTPVCSQKVWTTPLHRDKTPLHQKHAAF
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLEEDLKE 
VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSLALDIV 
DEDMKLMMSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNQGF
LQAKPEKAAVAQKPRSHFTTPAPMSSAWKTVACGGTRDQLFMQE
KARQLLGRLKPSHTSRTLILS 




1940 


239 



AGPDDTMKRSLQALYCQLLSFLLILALTEALAFAiaEPSPRESL 
QVLPSGTPPGTMVTAPHSSTRHTSWMLTPNPDGPPSOAAAPMA 

IPTPRAEGHPPT\TPS ppslrq* pppilkap /sstgpapaamat 

rs SKPEGRPRGQAAPT I LLTKPPGATS RPTTAP PRTTTRR PPRP 
PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
[jGKIFQIYKGMFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 








TVPSNTS WAFl'A^SLGPAKDKPGI^lRAAQGGGSTFTS QGGTPDA ~ 

TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP+*LIAYCYP\CT 

SRPLSTSSGVFTAATGPTPAAFDTSVSAPSQGIPQGASTrPOAP 

THPSRVSESTISGAKEETVA\PSP*PTGCPVLSPQWYPQPQAIS 

STAWSPPGPGSLGQQGTSPMWPRGTNRSTEPPSA*ARWISPG*S 

WPSACPSPP\LCPADGVLHEEEEEDRQPGEQPEAYGNNTHHPGT 

TFQQAC\RGAAPGEIPVPLKPLRTQLSEPRSPANGDYRDTGMVP 


6953 


658 


304 


4 i -L"<3WiJ&nuvatv.xityi 1 \ATPTPPSGSG\CE 
PTPRLVLLLHGPLRPSQLLRHCGE*EQSASPLQLDGKDASALWT 
ASRQARGELRLCLTTAVRGTSPS VS PVCOSS 


6954 


1512 


349 


^WGKTRAi^GKHVPFGkQ'I-NPNKS/ VHCDS » G * » RRE TTQDBS 
PS PHFRGKMGG W\ KIiEKELENTEQPVGGNEG * EHEVTGNLNS D 
PLLELCQCPLCQLDCGSREQLIAHVYQHTAAWSAKSYM\CPVC 
GRALSSPGSLGRHLLIHSSDQRSNCAVCGARFTSHATFNSEKLP 
« y jji^i iooi»r i vniv&KjVbti AfcXjKDI AFSPPVYPAGILLVCNNCAA 

YRKLtiEAQTPSVRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 
ET P E EREVRRMRDRE AKRLQRMQETDEQRAR RLQR DREAMRLKR 
AIETPEKRQARLIREREAKRLKRRLEKMDKMLRAQFGQDPSAMA 
ALAAEMNFFQLPVSGVELDSQLLGKMAFEEONSSSr.H 


6955 


819 


1 


PPPPFIIPSHPRSAGT*AG*KRSGDSECSPPVKQ*A*TRAAAQN" 

* PQR*RWTEGNS PQAS AVATPGQGAS PAA PR CT P * PSRRHRRLP 

PGARPPAG*AAPAPTKPWLAGPASAPQPGAAPLSPPAPPLIRTR 

* CAG AAARGR PRRDRS PR PRTPGGCS WS EPRTPPAVSASAQTPS 
DAG*AGGR*GQRQRPSTGR+PPGVGGAGRSHRREGTIPGNPHPR 

AS*RAGWOR*PRD /DFUrr +fn/VDn»»nnnn/, n — 

*^*V KbW v>lj*EPQGEEMSGPGGPGGAPPNOVGSS 

VMQAMSTGI 


6956 


1968 


782 


FFUHRQVRAQVAGAPVGHWGTRARQVKTGGRRRARRTMPFLiGQD 
WRSPGWSWIKTEDGWKRCESCSQKLERENNHCNISHSIILNSED 
GEIFNNEEHEYASKKRKKDHFRNDTNTQSFYREKW1YVHKESTK 
ERHGYCTLGEAFNRLDFSSAIQDIRRFNYWKLLQLIAKSQLTS 
LSGVAQKNYFNILDKIVQKVLDDHHNPRLIKDLLQDLSSTLCIL 
/N * R S RE VC ISG KHQ YLDLP I RN YSRLATTATGS S DD * AS E \NG 
LTLS DLPLHMLNN I L YR PS DGWDI ITLG QVT PTL YMLSEDRQLW 
KKLCQYHFAEKQFCRHLILSEKGHIEWKLMYFALQKHYPAKEQY 
GDTIiHFCR HCS I L FWKDSGH PCTAADPDS C FTP VS PQH F I DI»FK 
F 




8605 


3839 



QTSTSI FAS PI'S PPVIjGESVIiQDNSFDLNNGSDAEQEEMETQSS 
DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 
PEXSPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 

tspkaspvtspaaafptaspankdvssflettadveeitgeglt 
asgsgdvmrrriatpeevrlplqhgwrrevrikkgshrwqgetw 

VYGPCGKRMKQFPEVI KYLSRNWHS VRREHFSFS PRMP VGDFF 
EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 
PKVKRGRGRPPKVKITELLNKTDNRPLKKLEAQETLNEEDKAKT 
AKSKKKMRQKVQRGECQTTIQGQARNKRKQETKSLKQKEAKKKS 
KAEKEKGKTKQEKLKEKVKRBKKEKVKMKEKEEVTKAKPACKAD 
KTLATQRRLEERQRQQMILEEMKKPTEDMCLTDHQPLPDFSRVP 
3LTLPSGAFSDCLTIVEFLHSFGKVLGFDPAKDVPSLGVLQEGL 

G cqgds lge vqdll vrllkaalh dpgfpsycqslkilgekvsei 

PLTRDNVSEILRCFLMAYGVEPALCDRLRTQPFOAQPPQQKAAV 

uaflvhelngstliineidktlesmssyrknkwivegrlrrlkt 

/LAKRTGRSEVEMEGPEECLGRRRSSR1MEVTSGMEBEEEEES1 

\avpgrrgrrdgevdatassipelerqieklskrqlffrkkllh 
?sqmlravslgqdryrrrywvlpylagifvegtegnlvpeevik 

CETDSLKVAAHASLNPALFSMKMELAGSNTTASS PARARGSPRtr 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
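
The legend above defines a small alphabet: twenty one-letter residue codes plus four markers (X, *, / and \). As a minimal sketch, assuming a table entry is handled as a plain string, the following Python tallies the residues and markers in one amino acid segment. The function name summarize_segment and the "unrecognized" bucket are illustrative assumptions and are not part of the listing; characters outside the legend (for example, scanning noise) are reported rather than interpreted.

```python
# One-letter residue codes exactly as given in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Non-residue markers from the same legend, mapped to summary keys.
MARKERS = {"X": "unknown", "*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment):
    """Count residues and legend markers in one amino acid segment.

    Whitespace is skipped. Any character that the legend does not define
    is collected under "unrecognized" instead of being guessed at.
    (Hypothetical helper for illustration only.)
    """
    summary = {
        "residues": 0,
        "unknown": 0,
        "stop": 0,
        "deletion": 0,
        "insertion": 0,
        "unrecognized": [],
    }
    for ch in segment:
        if ch.isspace():
            continue
        if ch in AMINO_ACIDS:
            summary["residues"] += 1
        elif ch in MARKERS:
            summary[MARKERS[ch]] += 1
        else:
            summary["unrecognized"].append(ch)
    return summary
```

For a made-up segment such as "PL/MCV\AX*", the summary reports six residues and one each of the deletion, insertion, unknown and stop markers. Keeping unrecognized characters separate makes it easy to spot entries that still need to be checked against the original filing.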


6957 






TKPGSMQPKHbKSPVRGQDSEQPQAQLQPEAQLHAPAQPQPQLQ 
LQLQSHKGFIiEQEGSPLSIiGQSQHDLSQSAFLSWLSQTQSHSSL 
LSSSVLTPDSSPGKLDPAPSQPPEEPBPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSEDQPTPSPQQLASSKPMNRPSAANPCS 
PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 

FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKALKPR 

1 GIREKALiHKHTiNTJf WPHPT jTM2\rm Dnc>r\nTpnrM-^, 

j WA "^ lWA ^*^" w *vtmuiri^BVCltKPbADPIFEPRQLPAPQEGIM 
S WS PKE KTYE TDIiAVLQW VEBLEQR V 1 MS DLQ I RGWTCPS PDST 
REDLAYCKHLSDSQEDITWRGRGREGLAPQRKTTNPLDLAVMRL
AALEQNVERRYLREPLWPTHEWLEKALLSTPNGAPEGTTTEIS 
YEITPRIRVWRQTLERCRSAAQVCLCLGQLERSIAWEKSVNKVT
CLVCRKGDNDEFLLLCDGCDRGCHIYCHRPKMEAVPEGDWFCTV 
CLAQQVEGEFTQKPGFPKRGQKRKSGYSLNFSEGDGRRRRVLLR 
GRESPAAGPRYSEEGLSPSKRRRLSMRNHHSDLTFCEI XLMEME 
SHDAAWPFLEPVNPRLVSGYRRHKNPMDFSTMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRFFE\SRWEEF 
YC^KQGQSVRQ^RWGVTIiWHLPPTFQTKTCHFHLLMLPWVQTQV 


6958 


82 


3514


HLIVAMPEPVKKEENBVPAPAPPPEEPSKEKEAGTTPAI^WfLV" 

ETPPGEEQAXQNANSQLSILFIEKPQGGTViCVGEDITFIAKVKA 

EDLSEKPTINGSRKWMDLASKAGKHLQLKETFERHSRVYTFEMQ 

1 1 KAKDNFAGNYRCEVT YKDKFDSCS FDLEVHESTGTT PN I D I R 

SAFKRSGEGQEDAGELDFSGLLKRREVKQQEEEPQVDVWELLKN 

TKPSEYEKIAFOYESPTCSGMLKRLKRSIRBEKKSAAFAKILDP 

VYQVDKGGRVRFWELADPKLEVKWNKNGQELRPSTKYI FEDTR 

CQS I LNI DNCQMTDDSE Y YVTAGDE KCSTELLVREPP I MVTKQL 

EDTTDYCGERVELECEVSEDDAQVKWFKNGEEIILVQTRYRIRV 

EGKKHILIIEGATKADAADYSVMTTGGQSSAKLSVDLKPLKILT 

PLTDQTVNLGKEICLKCEISENIPGKWTKNGLPVQESDR1»KWH 

KGRIHKLVIDHALTEDEGDYVFAPDAYNVTLPAKVHVIDPPKI I 

LDGLDADNTVTVIAGNKLRLEIPISGEPPPKAMWSRGDKAIMEG 

SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 

x *" r-i^r-x- vt\tr i v l a vbuuwi lMNWEPPAYDGGSPILGYFIE 

RKKKQSSRWMRLNFDLCKETTFEPKKMIEGVAYEVRIFAVNA\I 
GISKPSMPSRPFVPLAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGLDGYVLEYCFEGSTSAKQSDENGEAAYDliPAEDWlVANKD 
H DKTKFTI TGL PTDAK I F VR VKAVN AAGAS E p K YYS Q P I L VKE 

IIEPPKIHSPKHLKQTYlRRVGDRVItiVIPFQGKPRPEljTWKKD 
GAEIDKNQINIRNSETDTI I FIRKAERSHSGKYDLQVKVDKFVE 
TASIDIRIIDRPGPPQIVKIEDVWGRNVALTWTPPXDDGNAAIT 
GYTIQKADKKSMEWLRVIEHIIEPVPHTELV1GNEYYFRVFSEN 
MCGLSEDATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL 
VNRLCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 
NQGVCTLEIRKPSPYDGGTYCCKAVNDLGTVEIECKLEVKVIAO 


6959


274 
1 


1663 

1469 [; 


PRTSRVKTEGSQGSSAMDFSVKVDIEKEVTCPICLELLTEPLSL 
DCGH5FCQACITAKIKESVIISRGESSCPVCQTRFQPGNLRPNR 
HLAN1 VERVKEVKMS PQEGQKRDVCEHHGKKLQI FCKEDGKVI C 
WVCELSQEHQGHQTFRINEWKECQEKLQVALQRLIKENQEAEK
LEDDIRQERTAWKNYIQIBRQKILKGFNEMRVILDNEEQRELQK 
LEEGEVNVLDNLAAATDQLVOQRQDASTLISDLQRRLRGSSVEM 
LQDVIDVMKRSESWTLKKPKSVSKKLKSVFRVPDLSGMLQVLKE 
LTDVQYYWVDVMLNPGSATSNVAISVDQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSLWK 
RKSSGFAFDPSVNYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTLFMAV\LPWLGFS 

^VHWEFGRGIBDFPYLFFOLTHCOQRICSVTOAGVQWCDHSS 








LQPQTPGLNQSSHLSLbSSRDYRMLSSFNBWFWQDRFWLPPt^VT 
WTEI£DRIX3RVYPHPQDLU^PL^VIiLAMRLAFERFIGLPLS 
RWLGVRDQTRRQVKPNATLEKHFLTEGHRPKEPQLSLLAAQCGL
TLQQTQRWFRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGLSV
LYHESWLWAPVMCWDRYPNQLTLSCPAADSEA\SLYWWYLLBLa 
FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HFVAVILMTFSYSANLIiRIGSLVLLLHDSSDYLLEACKMVNYMQ 
YQQVCDALFLIFSFVFFYTRLVLFPTQILYTTYYESISNRGPFF
GYYFFTfGLLMLIjQLLHVFWSCLILRMLYSFMKKGQMEKDIRSDV 
EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


387 


2068 


AKWARE KE MQE F \TRS FF \ RGR P DLSTLTHS I VRRRY LAHSG RS 
«*" i\y/iLfiVKii v fib rii'tiKMy VL/iiAAS R EDKLDLTKKG KRP PT 
PCSDPERKRFRFNSESESGSEASSPDYFGPPAKNGVASRSHTHP 
KEENPRRA\SKAVEESSDEERQRDLPAQRGEESSEEEEKGYKGK 
TRKKP WKKQAPGKAS VSRKQAR EESE ESEAEP VQRTAKKVEGN 
KGTKSLKESEQESEEEILAQKKEQREEEVEEEEKEEDEEKGDWK 
PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSEEEQKEAASSG 
DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 
KGS RXMARLG S TSGEES DLER E VS DS EAGGG PQGERKNRS SKKS 
SRKGRTRSSS SSSDGS PEAKGG KAGSGRRGEDHPAVMRLKRYIR 
ACGA1IRNYKKLLGSCCSHKERLSILRAELEALGMKGTPSLGKCR 
m cy lUti&AAJS v Ao L>u V AN 1 1 SGS GRPRRRTAWNPLGEAAPPGB 
LYRRTLDSDEERPRPAPPDWSHMRGI ISSDGESN 


6961 


340 


1646 


RPWSSPTMKPNFSLRLRIFNi^CWGIPYLSKHRADRMRRLGDFL 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGIIGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGHiVLHL 
SGMVLNAYVTHLHAE YNRQKD I Y IiAHR VAQAWELAQF I HHTS KK 
ADWLLCGDIiNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\Lr. 
LA LL CVLAAGGGAG E AA I L T.WT P Q VC3 T ,VT >W AO, A tpvt vmrrwnnsn 

LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6962 


340 


1646 


RPWS S PTMKPNFSLRLR I FNLNCWG I P YLSKHRADRMRRLGDFI* 
NQESPDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGXIGSG 
LCVFSKHPIQELTQHIYTLKGYPYMtHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAEYNRQKDIYLAHRVAQAWELAQFIHHTSKK
ADWLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\LL
LALLCVLiAAGGGAGEAAI LL WTPS VG LVLWAGAF YLFHVQE VNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6963 


374 


2618 


RVTPLILKLLKKPKTAENQKASEENEITQPGGSSAKPGLPCLNF
EAVLSPDPALIHSTHSLTNSHAHTGSSDCDISCKGMTERIKSIN 
LHNFSNSVLETLNEQRNRGHFCDVTVRIHGSMLRAQRCVLAAGS
PFFQDKLLLGYSDIEIPSWSVQSVQKLIBFMYSGVLRVSQSEA 
LQILTAASILQIKTVIDECTRIVSQNVGDVFPGIQDSGQDTPRG
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSQQMERYL 
STTPETTHCRKQpRPVRIQTLVGNIHIKQEMEDDYDYYGQQRVQ 
ILERNESEECTEDTDQAEGTESEPKGESFDSGVSSSIGTEPDSV 
EQQFGPGAARDSQAEPTQPEQAAEAPAEGGPQTNQLETGASSPE
RSNEVEMDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVIGTAGNTYLPALFTTQPAGSGPK 








* *** wurvrunuyyiyr v x v^yruuo 1 C 1 t\\JlJt'i\lr , \ J l H i tA.S.S A(-»H 

STASGQGEKKPYECTLCNKTFTAKQNYVKHMFVHTGEKPHQCSI 
C WRS FS LKDYL I K\ HMVTHTG VRA YQCS I CNKR FTQKS S LNVHM 
RLHRGEKSYECYICKKKFSHKTLLERHVALHSASNGTPPAGTPP 
GARAG P PG WACTEGTTYV CS VCPAKFDQ I EQ FNDHMRMHVS DG 


6964 


1 


178


SGRPFFFFFSNTDVYFIKKVTNRWTAGSSYKMTRMKSIGKILLL 
QIFIG\NCSMFVLVI 


6965 


757 


208 


NVFIEPRIQGFMKTSAHPGQKHPDFSMGLLFPLLAALEVCSCGS- 
Csu^ijL* IQILib'lJNtl I u VJjLGQMRRISPFLCLKDRSDFRF 

PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWV I EGSTLALRRY 
FQESISTLE 


6966 


820 


1867 


I ITALGVRGMPGCPCPGCGMAG PRLLFLTALALELLGRAGGSQP 
ALRSRGTATACRLDNKBSES WGAIiLSGERLDTW I CSLLGSLKVG 
LSG V FP LLV I P LEMGTMLRS E AG AWRXiKQLLS FALGGLLGNVFL 
HLLPEAMAYTCSASPGGEGQSLCXXK3QLGLWVIAGILTFLALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGCRARPRCR 
G P\ DI KVSG YLNLLANTI DN FTHGLAVAAS FLVS KKIGLLTTMA 
ILLHE I PHEVGDFAILLRAGFDHWSAAKLQLSTALGGLLGAG FA 
ICTQ3 P KG VE ETAAWVLP FTSGGFLYIALVNVL PDLLEE E D PW 


6967 


162 


633 


GFLPFKYWILDLSASSRMETDCNPMELSSMSGFEEGSELNGFEG 
TDM KDMRLE AEAWNDVL F AVNNMF VS KS LRCADDVA Y I NVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHLQTPYHETVYSLLDTL\ 
S PAYREAFGKR \ LLQRLEALKRDGQ S 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQKT
LEQFHLSSMSSLGGPAAFSARWAQEAYKKESAKEAGAAAVPAPV 
PAATEPPPVLHLPAIQPPPPVLPGPFFMPSDRSTERCETVLEGE
TISCFWGGEKRLCLPQILNSVLRDFSLQQINAVCDELHIYCSR 
CTADQLEILKVMGILPFSAPSCGLITKTDAERLCNALIiYGGAYP 
PPCKKELAASLALGLELSERSVRVYHE\CFGKCKGL\LVPBLYS 
S PS AACI QCLD \ CRLMYP PHKF WHSH KALENRTCHWGF \DS A\ 
NWRAYILLSQDYTGKEEQARLGR \ CLDDVKEKFD YGNKYKRRVP 
RVSSEPPASIRPKTDDTSSQSPAPSEKDKPSSWLRTLAGSSNKS 
LGCVHPRQRLSAFRPWSPAVSASEKELSPHLPALIRDSFYSYKS
FETAVAPNVALAPPAQQKWSSPPCAAAVSRAPEPLATCTQPRK
RKLTVDTPGAPETLAPVAAPEEDKDSEAEVEVESREEFTSSLSS
LSSPSFTSSSSAKDLGSPGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKEKFLHEWKMRVKQEEKLSAALQAKRS
LHQE LEFLRVAKKE KLRE ATEAKRNIjRKE I ERJLRAENEKKMKEA 
NESRLRLKRELEQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 
QHAEADREQLRADLLREREAREHLEK\WK\ELQEQLWPRARPE 
AAGSEG\AAELEP 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRLEREQKLKLYQSATQAVFQKRQA 
GELDESVLELTSQILGANPDFATLWNCRREVLQQLETQKSPEEL 
AALVKAELGFLESCLRVNPKSYGTMHHRCWLLGRLPEPNWTREL 

elcarfletoernfhcwdyrrfvatqaavppaeelaftdslitr 
nfsnysswhyrscllpqlhpqpdsgpqgrlpedvllkelelvqn 
afftdpndqsawfyhrwllgradpqdalrclhvsrdeacltvsf 
srpllvgsrmeilllmvddsplivewrtpdgrnrpshvwlcdlp 
aaslndqlpqhtfrviwtagdvqkecvllkgrqegwcrdsttde 
qlfrcelsvekstvlqselesckelqelepenkwcl\ltiillm 
raldpllyeketlqyfqtlk\awdpkraty\lddlrskfllens 
vlkmeyaevrvlhlahkdltvlchleqlllvthldlshnrlrtl 
ppalaalrcledppprt\vlqasdnaiesldgvtnlprlqelll 
cnnrlqqpavlqplas cprlvllnlqgnplcqavg i leqlaell 
psvssvlt 


6970 


3 


1528 


SPPPIiLSSPSAVGEGKVAVAAPCPGRSECARAXMAYIQLBPLNE 
GFLSRISGL LLCRWTCRHCOQKCY ES SCCQS S EDEVE I LG P F P A 
QTPPWLMASRSSDKDGDSWTASEVPLTPRTNSPDGRRSSSDTS 
KSTYSLTRRISSLESRRPSSPLIDIKPIEFGVLSAKXEPIQPSV 
LRRTYNPDDYFRKFEPHLYSLDSNSDDVDSIiTDEEILSKYQLGM 
UIFSTQYDLLHNHLTVRVIBARDLPPPISHDGSRQDMAHSNPYV 
KICLLPDQKNSKQTGVKRKTQKPVFEERYTFEIPFLBAQRRTLL 
LTWDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 
VEI/aELLLSL^LPSAGRLNVDVIRAKQLLQTDVSQGSDPFVKI 
QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELENASLV 
FTV FGHNMKS SNDF I GR I V I G \ Q YS SGP \ S E PNHWRRMLNTHRT 
AVEQWHSLRSRAECDRVSPASLEVT 


6971 


37 


3702 


ACFYVPGSRSFKIjI PRHGLVNKGRSGKLPSGVSAKLKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDbTVDAVKLHNELQSGSL 
RLGKSEAPETPMEEEAELVLTEKSSGTFLSGLSDCTNVTFSKVQ 
RFWESNSAAHKEICAVLAAVTEVIRSQGGKETETEYFAALIRKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK
EATTTLHMLTLLKDLLPCFPEGLVKSCSETLLRVMTLSHVLVTA
CAMQAFHS L FHAR PGLS TLS AE LNAQ 1 1 TALYD YVPSENDLQPL 
LAWLKVMEKAHINLVRLQWDLGLGHLPRFFGTAVTCLLSPHSQV
LTAATQSL.KE I LKECVAPHMADIGS VTSSASGPAQS VAXMFRAV 
EEGLTYKFHAAWSSVLQLLCVFFEACGRQAHPVMRKCLQSLCDL 
RLS PH F PHTAALDOAVGAA VTSMG PE VVLQAVPLEI DGSEETLD 
FPRSWLLPVIRDHVQETRLGFFTTYFLPLANTLKSKAMDLAQAG
STVESKIYDTLQWQMWTLLPGFCTRPTDVAISFKGLARTLGMAI
SERPDLRVTVCQALRTLITKGCQAEADRAEVSRFAKNFLPILFN
LYGQPVAAGDTPAPRRAVLETIRTYLTITDTQLVNSLLEKASEK
VLDPASSDFTRLSVLDLWALAPCADEAAISKLYSTIRPYIjESK 
AHGVQKKAYRVLEEVCASPQGPGALFVQSHLEDLKKTLLDSLRS
TSSPAKRPRLKCLLHIVRICLSAEHKEFITALIPEVILCTKEVSV 
GARKNAFALLVEMGHAFLRFGSNQEEALQCYLVLIYPGLVGAVT 
MVSCSILALTHLLFEFKGLMGTSTVEQLLENVCLLLASRTRDW 
KSALGFIFCVAVTVMDVAHLAKHVQLVMEAIGKLSDDMRRKFRMK 
LRNLFT \ KFI P K \ FG I LTWG KKAVG P KEYHR VLVN I RKAEARAK 
RHRALSQAAVEEEEEEEEEEEPAQGKGDSIEEILADSEDEEDNE 
EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVLA 
TQPGPGRGRKKDHSFKVSADGRLIIREEADGNKMEEEEGAKGED
EEMADPMEDVIIRNKKHQKLKHQKEAEEEELEIPPQYQAGGSGI
HRPVAKKAMPGAEYKAKKAKGDVKKKGRPDPYAYIPLNRSKLNR
RKXMKLQGQFKGLVKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAILLPLWRRTRPREATVPRGAAQRGRARSAEGRIPSSQSPS 
PAEAGGATRSPPPRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AAATBRARRGATMGAQLSTLGHMVLFPVWFLYSLLMKLFQRSTP 
A I TLBS PDIKYPLRLIDRBI ISHDTRRFRFALPSPQHILGLPVG 

riWTVT.Q^DTrviMT tn/D DVTD TOO r\nnrr< tnfnr \it mrwirr\mt*Y\v 
yni iJj^i>\Ki.lAjJNijV VKfX 1 PXSoUUDKGrVDLVIKViFKDTHPK 

fpaggkmsqylesmqigdtiefrgpsgllvyqgkgkfairpdxk 
snp 1 1 rtv ks vgm i aggtg i tpmlq v i raimkdpddhtvchll f 
anqtekdiiilrpeleelrnkhsarfklwytldrapeawdygqg\ 
fvneemirdhlpppe\eeplvlmcgpppmiqvaclpnl\dhvgh 

PTERCFVF 


6973 


1 


1964 


LQPRCAHRGLRAQKCGR papgvdamvlc p vigkllhkrwlasa 
sprrqeilsnaglrfewpskfkekldkasfatpygyametakq 

KALE VANR LYQKDLRA PDW I G ADTI VT VGGL I LE KPVDKQDAY 
RMLSRFE/ SGREHS VFTGVAI VHCSSKDHQLDTRVS E FYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFL 
NWGFPLNHFCKQLVKLYYPPRPEDLRRSVKHDSIPAADTFBDL 


" 6974 






SDV3GGGSEPTQRDAGSRDEKAEAGEAGQATAEABCHRTRETLP 
P F PTRLLELI EGFMLS KGLLTACKLKVFDLLKDEAPQKAADI AS 
KVDASACGMERLLDICAAMGLLEKTEOGYSOTETANVYLASDGE 
YSLHGFIMHNNDLTWNLFTYLEFAIREGTNQHHRALGKKAEDLF 

ODAYYO^DFTDT D rMDKMUniTVT T»7\ /*</-n rum-*. ■ iw i u t\ •>-.-. — » 

^ •< " 1 KliKr mtAMHUHTKLTACQVATAFNTtSR PSS ACDV 
GGCTGALAR E LARE YPRMQVTVFDLP D 1 1 ELAAHFQP PG PQ A VQ 
IHFAAGDFFRDPLPSAELYVLCRILHDWPDDKVHKLLSRVAESC 
K PG AGLLLVETLLDE E KR VAQRALMQS LNMLVQTEGKERS LG E Y 

OCLL£LHOFHD\/nV\rUTJT/;\rT riATr \ nnirunnp^««nnr 




3082 


2172 


RSCAAFASFASRPPLELFAPPGSHRSPPGRGVATSAQCALSVRk 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSIiPTSAPLSV 
SLPTNIVPPTTIWTSSPQNTDADTASPSNGTHNNSVLPVTASAP
TSLLPKNISIESREEEITSPGSNWEGTNTDPSPSGFSSTSGGVH 
LTTTLEEHSLGTPEAGVAATLSQSAAEPPTLISPOAPASSPSSL 
o iorfavr i»ii3V i liXHbSTVTSTQPTGAPTAPESPTEESSSDHT 
PTS HATAE P V PQE KTP PTT VSG KVM CELIDMET\PPP FPG 


6975 


2 


500 


R PR PT VH CCKWALKLE TAME TLINVFHAHS GKEGDK YKI*S KKEL 
KELLQTELSGFLDVKELML*ATEALKTFEEA* KSPI IQCSSSRS 
SLPPAPQPPPYL*IjSAVPFPlHLPLPLLPPQAQKDVDAVDKVMK 
BLDENGDGE VDFQE Y WLVAALTVA CNNFFWENS 


6976 


1216 


970 


wuyu ~ VMi<j i 1 ENS P VTFAHFPEDTVEQKAES VGRIMPHTEAR I 
MNMEAGTIiAKLNTPGELCIRGYCVMLGYWGEPQKTEEAVDQDKW 
j w i ouva i rojM&ytj* CKI VGRSKDMIIRGGENI YPAELEDFFHTH 
PKVQEVQWGVKDDRMGEEICACIRLKDGEETTVEEIKAFCKGK 
ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL*IKQQ 
ACPGRIA 


6977 


1298 


588 


SLFINTNLLSNQIRKTSFGMCSEPISDNTEDQKGKLKTPDFA*R 
AKKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 
VYSTQVGFAGGYTSNPTYKEVCSEKTGHAEWRWYOPEHMSFE 
ELLKVTWE^DPTG^MRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KENYQKVLSEHGFGPITTDIREGQTFYYAEDYHQQYLSKNPNGY
CGLGGTGVSCPVGIKK


6978 


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKLSKEAKQRLQQLFKGSQ' 
FAIRWGFIPLVIYLGFKRGADPGMPEPTVLSLLWG


6979 
6980 


3917 
1 


1146 
420 


DEARVRGEAVAAAILSRCRHWSGPPPFPPSPPDRKGLRGTEPWE 
AGPGSGATPGAJtRMDVRRLKVNELREELQRRGLDTRGLKTELAE 
RLQAALEAEEPDDERELDADDEPGRPGHINEEVETEGGSEIjEGT 
AQPPPPGLQPHAEPGGYSGPDGHYAMDNITRQNQFYDTQVIXQE 
NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTSFLPPEASQL.KP 
DRQQFQSRKRPYEENRGRGYFEHREDRRGRSPQPPAEEDEDDFD 
DTLVAIDTYNCDIiHFKVARDRSS G YPIiTI EG FAY LWSGARAS YG 
VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSLDSCSTQL 
GEEPFSYGYGGTGKKSTNSRFENYGDKFAENDVIGCFADFECXjN 

dvels ftkngkwmgiafr i qkealggqalyphvlvkncavefnf 
gqraepycsvlpgftfiqhlplserirgtvgpkskaecbilmmv 

GLPAAGKTTWAIKHAASNPSKKYNILGTNAIMDKMRVMGLRRQIl 

nyagrwd\^iqqatqclnrliqiaarkkrnyildqtnvygsaqr 
rkmrpfegfqrkaivicptdedlkdrtikrtdeegkdvpdhavl 
emkanftlpdvgdfldevlfieloreeadklvrqyneegrkagp 
ppekrfdnrggggfrgrgggggfqryenrgppggnrggfqnrgg 
gsggggnyrggfnrsggggysqnrwgnnnrdnnnsnnrgsynra 
pqqqpppqqppppqpppqqpppppsysparnppgastynknsni 
pgssantstptvssysppqsfgffpstfqpsysqppynqggysq 
gytapppppppppaynygsyggynpapytppppptaqtypqpsy 

NQYQQYAQQWNQYYQNQGQWPPYYGNYDYGSYSGNTQGGTSTQ 
GTRGRKTGRVAAPSTRRRTGNMQKLQTRSPAMSLSDPGLGYHPT 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


CWTLRW PPLCSLHALHVFHCLFS SRLGTPVS PRLAMD PNCS CEA 

GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGC1CKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRAii,RPAFAARGVFQGGLGQAKQARTRACAAljPTPHPS 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCliVSVPPT 
RQSLRKLKQRFPSQSQATLLSIFSQEYQKHIKRTHAKHHTSEAI 
ES YYQRYLNGWKNGAAPVLLDLANEVDYAPSLMARLI IiERFLQ 
EH EETP PS KS I INS MLRDP SQ I PDGVLANQ VYQC I VND CCYG P L 
vi^v-j.raiMj.^nc.tiisvjjbKUljLIjEKNbSFLDEDQLRAKGYDKTPDF 
ILQVPVAVEGHI IHWIESKAS FGDECSHHAYLHDQFWS YWNRFG 
PGLVIYWYGFIQELDCNRERGILLKACFPTNIVTLCHSIA 


6982 


153 


1285


FPQQDCSAPAAPGLAGSEPRRLRAYRRRRQRARGLKRVAWuAPP 
PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHPVI KAFLCGS I SGTCSTLLFQPLDLLKTRLQTLQ 
P5 DHGSRRVGMLAVLLKVVRTESLLGLWKGMS PS I VRCVPGVG I 
YFGTLYSLKQYFlrRGHPPTALESVMLGVGSRSVAGVCMSPITVI 
KTRYESGKYGYESI YAALRSI YHSEGHRGLFSGLTATLLRDAPF 
SG I YLM FYNQTKNI VPHDQVDATL I PlTNFSCGI FAG I LAS LVT 

QPADVIKTHMQLYPLKFQWIGQAVTLIFKDYGLRGFFQGGIPRA 

IjPP TT.MA AMi Wn/VTJPMMRVuriT iro 
* ul'J/iftl'lAWi v xBtiMMAKWQIjKS 


6983 


82 


773 


EMSFLQDPSFFTMGMWSIGAGALGAAALALLLANTDVFLSKPQK 
AALEYLEDIDLKTLEKEPRTFKAKELWEKNGAVIMAVRRPGCFL 
CREEAADLSSLKSMLDQLGVPLYAWKEHIRTEVKDFQPYFKGE 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 
vauur lijuovt VVGSGKQGILLEHREKEFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


1845 


1282 


GGR S AYS LP AGS L P R VPATAAAKMASG VQVAD E VCR I FYDMKVR

KCSTP EEIKKRKKAVI FCLSADKKC 1 1 VEEGKE I LVGD VGVTI T 

DPFKHFVGMLPEKDCRYALYDASFETKESRKEELMFFLWAPELA 

PLKSKMIYASSKDAIKKKFQGIKHECQANGPEDLNRACIAEKLG 
GS Ij I VA C P V 


6985 


1887 


1324 


RRTAG I YPCF PKPGRTRHALCS WLLLLTGQLAFDD FQES CAMM 
WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY 
YKLAVEQLQSHPEAQEALGPPLNIHYLKLIDRENFVDIVDAKLK 
IPVSGSKSEGLLYVHSSRGGPFQRWHLDEVFLELKDGQQI PVFK 
LSGENGDEVKKE 


6986 


642 


1350 


yhlyfkmgdpnsrkkqalnrlraOlrkkkesij^qfdfkmyiaf 

VFKEKKKKSALFEVSEVIPVMTNNYEENILKGVRDSSYSLESSL 

ellqkdwqlhapryqsmrrdvigctqemdfilwprndiekivc 
llfsrwkesdepfrpvqakfefhhgdyekqflhvlsrkdktgiv 

avgtiedhlrpympe 


6987 


1623 


341 


leaaekasrafkesqrqtdsknyetenwspqksqrrydmyntac 
flgeievglytiqilqltpffhkenelskkhmvqflsgkwtipp 
dprnecylalskftshlknlqsdlkrcfdffidymvllkmrytq 
keiaeimlskkvsrcfrkytelfchldpcllqskesqllqeenc 
rkklealradrfaglleylnpnykdattmesivneyafllqqns 

KKPMTNEKQNSIIiANIILSCLKPNSKLrQPLTTLKKQLREVLQF 
VGLSHQYPGPYFLACLLFWPENQELDQDSKLIEKYVSSLNRSFR 
GQYKRMCRSKQASTLFYLGKRKGLNSIVHKAKIEQYFDKAQNTN 
SLWHSGDVWKKNEVKDLLRRLTGQAEGKLISVEYGTEEKIKIPV 
I S VYS G PLRSG RN I ERVS FYLG FS I EG PPGL 


6988 



3 


689


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 

lvrglgaastaapqdaqtgpqpmpradcimrhlpyfcrgqwrg 

FGRGS KQ LG I PTAN F P EQWDNLP AD I STG I YYGWAS VGSGDVH 

kmwsigwnpyykntkksmethimhtfkedfygeilnvaivgyl 
RPBKNFDSLESLISAIQGDIEEAKKRLELPEHLKIKEDNFFQVS 
KSKIMNGH 


6989 


2 


1118 


LMPSDRPLS PSTHASAGSHCHAPPTTARRAFPl PFGS KSNMATL 

ALVDVIEDKLKGEMMDLQHGSLFLRTPKIVSGKDYNVTANSKLV 
1 1 TAGARQQEGESRLNLVQRNVNI FKFI I PNWK YS PNCKLL I V 
SNPVDILTYVAWKISGFPKNRVTf?«lf;r>JT.ri<!aT3TOvT mcpot r>v 

HPI^OTGWVLGEHGDSSVPVWSGMNVAGVSLKTLHPDLGTDKDK 
EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADLAESIMKNLR 
R VHP VS TM I KGL YG I KDDVFLS VP C I LGQNG I S DLVKVTLTS EE 
EARLKKSADTLWGIQKELQF 


6990 


719 


258 


THASGMASVVLALRTRTAVTSLIiSPTPATALAVRYASKKSGGSS 
KNLGGKSSGRRQGIKKMEGHYVHAGNI I ATQRH FRWHPGAHVGV 
ivv - lj i nijcjCjij x vrt 1 1Kb vx v^hpkntBAVjjLITRLPKGAVLY 
KTFVHWPAKPEGTFKLVAML 


6991 


169 


451 


RR S S D FHN PG FLS R P VS LREN I H HQ VI CSTKN KRRN P KK'IAYXL 

SSLLMTNIiNPNESTENQPVDAYWAFTLDQEFLTYACVEGTGCLF 
CGRHVH 


6992 


944 


510 


RQAPGCSSLALRQVRQVYCGLVRAPQVQTRPLSSRFVERRGALY 
n o r rax y isr* f f f r fvifo f TAP i P P Y P PQPMG PGPMGG P YP P PQG Y 

PYGXSYPQYGWQGGPQEPPKTTVYVVEDQRRDEliGPSTCLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


qwcvtcpohnarqgpavppgiqaygaapfedJLqvdftemskcrg 

DRVWIKWNVASLCPLWKGPQTVVLSPPTAVKVEGIPAWIHHSH 
VKPAARETWEARPS PDNP FR VTLKKTTS PA P VTPGS 


6994 


346 


1100 


QWPEKDPVMAASSISSPWGKHVFKAILMVLVALILLHSAIiAQSR 

rdfappgc^kreapvdvltqigrsvrgtldawigpetmhlvses 
ssqvlwaissaisvaffalsgiaaqllnalglagdylaqglkls 

PG0VQTFLLWGAGALWYWLLSLLLGLVLALU5R I LWGLKLV1 F 
u/ujr uHLdiKo v fut'o I KALIJjIiALLILYALLSRLTGSRASGAOL 
EAKVRGLERQVEELRWRQRRAAKGARSVEEE 


6995 


144 


1346 


GSVAVGLSGIMAAQKDLWDAIVIGAGICGCFTAYHLAKHRKRIL

LLEQFFLPHSRGSSHGQSRIIRKAYLEDFYTRMMHECYQIWAQI, 

BHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 

EELKQRFPNIRLPRGEVGLLDNSGGVIYAYKALRALQDArRQLG 

GIVRDGEKWEINPGLLVTVKTTSRSYQAKSLVITAGPWTNQLL 

RPLGI EMPLQTLRINVCYWREMVPGS YGVSQAFPCFLWLGLCPH 

HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIL 

SSFVRDHLPDLKPEPAVIESCMYTNTPDEQP1LDRHPKYDNIVI 

GAGFSGHGFKLAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 
KAIIL 


6996 


543 


1942 


ETANAEAAARKSAMDWKEVLRRRIiATPNTCPNKKKSEQELKDEE

MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRLPAENEVLLQKLR 

EES RAVFLQRKS RE LLDNE E LQNL W FL LDKHQT PPMIGEEAMIN 

YENFLKVGEKAGAKCKQFFTAKVFAKLliHTDSYGRISIMQFFNY 

VMR KVWLHQTR I GLS LYDVAGQG YLRES DLENY I LELI P TL PQL 

DGLE KS FYS FYVCTAVRKF FF FLDP LRTGK I K I QD I LAC S FLDD 

LLELRDEELSKESQET^FSAPSALRVYGQYLNLPKDHNGMLSK 

EELSRYGTATMTNVFUDRVFQECLTYDGEMDYKTYXDFVLALEN 

RKEPAALQYIFKLLDIENKGYLNVFSLNYFFRAIQELMKIHGQD 

PVSFQDVKDE1FDMVKPKDPLKISLQDLINSNQGDTVTT1LIDL 

NGFWTYENREALVANDSBNSADLDDT 


6997 


370 


1104 


AMELTIFILRLAIYILTFPLYLLNFLGLWSWICKKWFPYFLVRF

TVIYNEQMASKKREliFSNLQEFAGPSGKLSLLEVGCGTGANFKF 

YPPGCRVTCIDPNPNFEKFLIKSIAENRHLQFERFWAAGENMH 
QVADGS VDVWCTLVLCS VKNQER I LREVCRVLRPGGAF Y FMEH 
VAAECSTWNYFWQOVLDPAHHLLFDGCNLTRESWKALERASFSK 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998 


2 


616 


F VS RAXiLRVRSRRH PAEERAAPGR PEDAP IECPGATNCP E PLWC 
SHLPVPYAP PTMESRGKS ASS PKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAP AEKQQP PAAPTTAPAKKTSAKAD PALLNNHSN 
LKPAPTVPSS PDATPEPKGPGDGAEEDEAASGGPGGRGPWS CEN 
FNPLLVAGGVAVAAIALI LGVAFLVRKK 


6999 


14 


1591 


GRAGACSRRDTAMS I E I ESSDVIRLI MQYLKENSLHRALATLQE 
ETTVS LNTVDS I ES FVAD INSGHWDTVLQAIQSLKLPDKTL I DL 
YEQWLEL1ELRELGAARSLLRQTDPMIMLKQTQPERYIHLENL 
LARS YFDPREAY PDGSS KEKRRAAI AQALAGBVS WPPSRLMAL 
L£ QALKWQQHQG LL PPGMT I DLFRG KAAVKDVE E E KFPTQ LS RH 
I KFGQ KS HVE CAR FS PDGQ YL VTGS VDGF IE VWNFTTG K I RKDL 
KYQAQDNFMMMDDAVLCMCFSRDTEMLATGAQDGKIKVWKIQSG 
QCLRRFERAHSKGVTCLSFSKDSSQILSASFDQTIRIHGLKSGK 
TLKE FRGHSS F VN EAT FTQDGH Y 1 1 S AS S DGTVKI WNMKT TECS 
NTFKSLGSTAGTDITVNSVILLPKNPEHFWCNRSNTWIMNMQ 
GQ I VR S FS SG KREGGD F VCCALS P RGE W I YCVGEDFVL YC FSTV 
TGKLERTLTVHEKDVIGIAHHPHQNLIATYSEDGLLKLWKP 


7000 


2 


827 


GPGVVFLELNESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNP LLQPALTGDVEGLQKI FEDPENPHHEQ AMQLLLEED 1 VGRN 
LLY AACMAGQSDV I RALAKYGVNLNE KTTRGYT LLHCAAAWGRL 
ETLKALVELD VD I EALNFRE ERARDVAAR YSQT ECVE F ti DW ADA 
RLTLKKYIAKVSLAVTDTEKGSGKLLKEDKNTILSACRAKNEWL 
ETHTEAS INELFEQRQQLED I VTPI FTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001 


2056


844 


RRCLIIAFLKGCFIFIYFIFIFETEFLSCCPGWSAVAQSRLIAN 
FASQVQAIFILPKDSQVGPDVKSEAAPKRALYESVFGSGEICGP 
TSPKRLCIRPSEPVDAWWSVKHDPLPLLPEANGHRSTNSPTI 
VSPAI VSPTQDSRPNMSRPLITRSPAS PLNNQGI PTPAQLTKSN 
APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEPIVLDSLKQ 
HYFIDRDGQMFRYILNFLRTSKLLIPDDFKDYTLLYEEAKYFQL 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGERITLSGDK 
S LIEE VFPE IGDVHCNSVNAGWNHDSTHVIRFPLNG YCHLNS VQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVLRRELRRTPRVPSVI 
RIKQEPLD 


7002 


1043 


498 


PMPS S TR WTTS * T Y TDTS S AWACRPTTG TCT * TAA PG PT VR W W P 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGTSWPAGRRTGTATSGTATTTSWJPGCGTRHWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAKGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFLQPPGMRIiSALLALASKVTLPPHYRYGMSPP 
GSVADKRKNPPWIRRRPWVEPISDEDWYLFCGDTVEILEGKDA 

nvrYiK^n/rtv/TDODMWtnn/rjr^T MTuvovTrvTMrvcnTMroopAn 
Vyjvuuixv v\J v JLKyKJNVl V V vuvjJuDi 1 n X K I Hjft.! FUJ X KLi 1 n JL FbfcAr 

LLHRQVKLVDPMDRKPTEI E WRFTEAGERVRVSTRSGRI IPKPE 
F PRADGI VPETWI DG P KDTS VE DALE RT Y VPCLKTLQE EVKEAM 
G I KETR \ NTRRS I G X E PGAEQ LL PNFCP S LEG 


7004 


121 


2285 


FLLPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G\PKRTLKTQLG/YYCRVRPLGFPDQECCIEVINNTTVQLHTPE 
GYRLNRNGDYKETQYSFKQVFGTHTTQKELFDWANPLVNDLIH 
GKNGLLFTYGVTGSGKTHTMTGS PGEGGLLPRCLDM I FNSIGS F 
QAKRYVFKSNDRNSMDIQCBVDALLERQKREAMPNPKTSSSKRQ 
VDPEFADMXTVQEFCKAEEVDEDSVYGVFVSYIEIYNNYIYDLL 
EEVPFDPINPNLHNLNCFVKiramNMYVAGCTEVEVKSTEEAFE 
VFWRGQKKRRIANTHIiNRESSRSHSVFNIKLVQAPLDADGDNVli 
QEKEQIIISQLSLVDLAGSERTNRTRAEGNRliREAGNINQSLMT
I^TCMDVLRENQMYGTNK^PYRDSKLTHLFKNYFDGEGKVRMI 
VC VNPKAED YEENLQVMR FAEVTQBVE VARP VDKA I CGIiTPGRR 
YRNQPRGP\IGNEPLVTDWLQSFPPLPSCEILDINDEQTLPRL 
I EALEKRHNLRQMM I DE FNKQSNAFKALLQE FDNAVLS KENHMQ 
GKLNEKEKMISGQKLEIERLEKKNKTLBYKIEILEKTTTIYEED 
KRNLQQELE TQNQKLQRQ FSDKRRDEARbQGMVTE TTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTEKPERPSRER 
DREKVTQRS VSPS PVPVSYL 


7005 


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWtAERLGLFEEL

WAAQVKRLASMAQKEPRTIKISLPGGQKIDAVAWNTTPYQLARQ 

I S S TLADTAVAAQVNGE P YDLER PLETDSDLR FLTFDS PEGKAV 

FWHSSTHVLGAAAEQFLGAVLCRGPSTEYGFYHDFFLGKERTIR 

GS ELPVLE R I CQELTAAAR PFRR LEASRDQLRQLF KDNP FKLHL 

IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGQIGGLKLLSNSSS 

LWRSSG 


7006 


22 


898 


NAFGRkSTAVKMAAAAWLQVLPVI LLLLGAHPS PLSFFSAGPAT

VAAADRSKWHIPIPSGKNYFSFGKILFRNTTIFLKFDGEPCDLS 

LNITWYLKSADCYNEIYNFKAEEVELYLEKLKEKRGLSGKYQTS 

SKLFQN'CSELPKTQTFSGDFMHRLPLLGEKQEAKENGTNIiTFIG 

DKTAMHEPLQTWQDAPYIFIVHIGISSSKESSKENSLSNLFTMT 

VEVKGPYEYLTLEDYPLMIFFMVMCIVYVLFGVLWLAWSACYWR 

DLLRIQFWIGAVIFLGMLEKAVFYAGFG 


7007 


2 


1001 


AMT VSG PG TP E PRPATPGAS S VEQLRKEGNE LFKCGD YGG ALAA
YTQALGLDATPQDQAVLHRNRAACHLKLEDYDKAETEASKAIEK 
DGGDVKALYRRSQALEKLGRLDQAVLDLQRCVSLEPKNKVFQEA 
LRN IGGQI QEKVRYMSSTDAKVEQMFQI IiLDPEEKGTEKKQKAS 
QNLWLAREDAGAEKIFRSNGVQLLQRLLDMGETDLMLAALRTL 
VGICSEHQSRTVATLS I LGTRRWS ILGVESQAVSLAACHLLQV 
MFDALKEGVKKGFRGKEGAIIVGEWKQVWGLLDVTWIEGMGLSQ 
PGQFFGDQTCSCRLFGIRFGDI ILL 


7008 


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRWLGRPPSGLPPGPR 
SPPPLAGPGQKMVQKKPAELQGFHRSFKGQNPFELAFSLDQPDH 
GDSDFGLQCSARPDMPASQPIDIPDAKKRGKKKKRGRATDSFSG 
RFEDVYQLQEDVLGEGAHARVQTCINLITSQEYAVKIIEKQPGH 
I RS RVFRE VEML YQ CQGHRNVLEL I E F FEE EDRFYL VFE KMRGG 
S I LSH I HKR RH FNELE AS VWQDVAS ALD FLHNKGI AHRDLKPE 
NILCEHPNQVSPVKICDFDLGSGIKLNGDCSPISTPELLTPCGS 
AE YMAPEWEAFS E EAS I YDKRCDLWSLGVILYILLSG YPPFVG 
RCGS D CG WD RGEAC P ACQNMLFE S I QEGRYE F PDKDWAH I SCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WDSHFLLPPHPCRIHVRPGGLVRTVTVNB 


7009


1 


626 


ARQLRNSWVDDFVAAPLI PLSQQI PTGNSLYESYYKQVDPAYTG 
RVGASE AALF LKKS GLS D 1 1 LG KI WDLADPEGKG FLD KQG F YVA 
LRLVACAQSGHEVTLSNLNLSMPPPKFHDTSSPLMVTPPSAEAH 
WAVR VEE KAKFDG I FES LLP I NGLLS GDKVK P VLMNS KL PLD VL 
GRVWDLSDIDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETLLSPLCPLLGGGTAMSGGEQKPERYYVGVDVGT 
G S VRAAL VDQS GVL LAFADQ P I ION WE PQ FNHHEQS S ED I W AACC 
WTKKWQGIDLNQIRGLGFDATCSLWLDKQFHPLPVNQEGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


RIQTLPNQNQSQTQPLLXTPPAVLQPIAPQTTFGVQTQPQPQSL 
LQAQISAASITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRLDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPQ 
RKRSRERSPRRERERSPRRVRRWPRYTVQFSKFSLDCPSCDMM 
ELRRPYQNLYIPSDFFDAQFTWVDAFPLSRPFQLGNYCNFYVMH 
RE VES LE KNMAI LD P PDADHLYS AKVMLMAS PSMEDLYHKSCAL 
AEDPQELRDGFQHPARI,VKFLVGMXGKDEAMAIGGHWSPSLDGP 
DPEKDPSVLIKT\AIRCCKALTG 


7012 


1


2661 


rragsvkrgearlfgpterOserplrpsaarrpemlsgkkaaaa 

AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGbSGPAEVGPGA 

vgertprkkepprasppgglaeppgsagpqagptvvpgsatpme 
tgiaetpeg\rrtsrrkrakveyremdeslanlsedeyysbeer 
nakaekekklpppppqappeeenesepeepsgvegaafqsrlph 
drmtsoeaacfpdiisgpqqtqkvflfirnrtlqlwldnpkiql 
tfeatlqqleapynsdtvlvhrvhsylerhglinfgiykrikpl 
ptkktgkviiigsgvsguu^qi^sfgmdvtlleardrvggrv 

ATFRK^TVADLGAMWTGLGGNPMAWSKQVNMELAKI KQKCP 

lyeamgqavpkekdemveqefnrlleatsylshqldfnvlnnkp 
vslgqalewiqlqefchvkdeqiehwkkivktqeelkellnkmv 
nlkekikblhqqykeasevkpprditaeflvkskhrdltalcke 
ydeiiaetqgkleeklqeue anp psdvylssrdrqildvffl fanle 
fanatplstlslkhwdqdddfeftgshltvrngyscvpvalaeg 

LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL 

plgvlkcxjppavqfvpplpewktsavqrkgfgnlnkwlcfdrv 

FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG 

imenisddvivgrclailkgifgssavpqpketwsrwradpwa 

RGSYSYVAAOSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE 
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP 
AQQSPSM 


7013 


1


2661 


RRAGSVKKGKARLFGPTER06ERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA 
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME 
TGIAETPEG\RRTSRRKRAKVEYREMbESLANL3EDEYYSEEER 
NAKAE KEKKLP PP p PQAP PEE ENES E P EE P SG VEG AAFQS RL PH 
DRMTSQEAAC F PDI I SG PQQTQ KVFLF I RNRTLQLWLDN P KI QL 
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL 
PTKKTGKVI I IGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV 
ATFRKGNY VADLGAMWTGLGGNPMAWS KQVNMELAKI KQKCP 
l»i fi^^AVFK£KDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP 
VS LG QALEWI QLQEKHVKDEQI EH W KK I VKTQE ELKELLNKM V 
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE 
YDELAETQGKLBEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVAIAEG 
LDI KLNTAVRQVR YTA5GCE V I AVNTRS TSQTF I YKCDAVLCTL 
PLGVLKC3QPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWDPSVMLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG 
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETWSRWRADPWA 
RGSYS YVAAGSSGNDYDLMAQPITPGPS I PGAPQPI PRLFFAGE 

HT2RNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP 
AQQSPSM 


7014 


3 


3950 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKLCPDTRVE 
ETMAL PQEGS LAR I P ETSLDCLENTI/3 VE EQRHETS DHEAEE P D 
CIISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVEWEM 
PLATDSPTSDPTBWNGISSQPQVPFHPNLQKSQYYSTVGGSHP 
HS EQY PDLLPLEARTRDYASLP P KRM Y5QLKTLQK P VLPLYRG S 
S VSASR WKPRQS S PQLHNLAS YTKKHHTSS VYS I S ERLEMK PG 
PQAQGLVMEAATHSQGDGSTDLDS Kl/TQQbl EFEKSLAGPGTE P 
DKILRHFSIMDFNSEKDIVRGSSICLITEQELPERRKALRPPPPR 
PCTPVSTSPHLLVDQNLKPAPPLWRPSRPAPLPPSAQQRTNAV 
S PKLLSRHRPTCETLE KEGPGHMGRS LDQ TS PC PLVLVR I E EM E 
RDLDMY S RAQEE LNLMLEE KQDES S RAETLED LKFCESNIES LN
NELQQLREM7 LLSSQS SS LVAPSG5VSAEN P EQRMLE KRAKV I E 
BLLQTERDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQMV 
I KVSKQLIiAALE ISDAVGPVFLGHRDELBGTY KI YCQNHDEAI A 
LLEIYEKDEKIQKHLQDSLADLKSLYNEWGCTNYINLGSFLIKP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKEINVNINE 
YKRRKDLVLKYRKGDEDSLMEKISKLNIHSIIKKSNRVSSHLKH 
LTGFAPQIKDEVFEETEKNFRMQERLIKSFIRDLSIjYLQHIRES 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQLFTNFKER 
TERLV IS PLNQIiUSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLE ELQ S ARNNY BALNAQLLD EL PX FHQYAQGLFTNCVHG Y 
AEAHCDFVHQALEQLKPLLSLLKVAGREGNLIAIFHEEHSRVLQ 
QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSE 
ELRASLLARYPPEKLFQAERNFNAAQDLDVSLLEGDLVGVIKKK 
DPMGSQNRWLI DNGVTKG FVYSS FLKPYNPRRSHSDAS VGS HSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQPTATPRSYRNFRHPEIVGYSVPGRNGQSQDLVKG 
CARTAQAPEDRSTEPDGSEAEGNQVYFAVYTFKARNPNELSVSA 
NQKLKXtiEFKDVTGNTEWWIiAEVNGKKGYVPSNYIRKTEYT 


7015 


1842


513


RQAWHE \ VAAP SWRGARLVQS VLRVWQVGPHVARERV1 P FSSLL 
GFQRRCVSCVAGSAFSGPRLASASRSNGQGSALDHFLGFSQPDS 
SVTPCVPAVSMNRDEQDVLLVHHPDMPENSRVLRWLLGAPNAG 
KSTLS NQLLGR KVF P VS RKVHTTRCQALGVI TEKETQV ILLDTP 
GIlSPGKQKRHHLELSLIiEDPWKSMESADLWVLVDVSDKWTRN 
QI*SPQLLRCLTKYSQIPSVLV^KVDCLKQKSVIjIjEIjTAALTEG 
WNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKE I FM L S ALS QE DVKT LKQ YLLTQ AQPGP WE YHS AVLTS QTPE 
EICANIIREKLLEHLPQBVPYNVQQKTAVWEEGPGGELVIQQKL 
LVPKES YVKLLIG PKGHVI SQ I AQEAGHDLMDI FLCDVDI RLSV 
KLLK 


7016


167 


2513 


I LNAP KPPPPRDS VEAVAAKRDTGGGS WGTGMDVSGQETDWRST 
AFRQKLVSQIEDAMRKAGVAHSKSSKDMESHVFLKAKTRDEYLS 
LVARLIIHFRDIHNKKSQASVSDPMNALQSLTGGPAAGAAGIGM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSSSSSSRRRYSSSSSSSNSKQ 
FQAQQSmQQ\QFQA\VVQQQQQL\QQQQQQQQHLIKLHHQNQQ 
QIQQQQQQLQRIAQLQLQQQQQQQQQQQQQQQQALQAQPPIQQP 
PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQOPPVAQNQPSQ 
LPPQSQTQPLVSQAQAliPGOMLYrQPPIiKFVRAPMWQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQS S LPMLSS PS PGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\QS PVTARTPQNFS VPS PGPLNTPVNPSSVMSPAGSSQAEEQQY 
LDKLKQIiSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDILTDP 
SKRCPLKTLQKCEIALEKLKNDMAVPTPPPPPVPPTKQQYLCQP 
LLDA VLANI RS P VFNHS Li YRT FVPAMTA T KR P P 7 TA P VWTP K" P 
RLEDDERQS I PS VLQGE VARLDPKFLVNLDPSHCSNNGTVHLI C 
KLDDKDLP SVPPLELS VPADYPAQS PLWIDRQWQYDANPFLQS V 
HR CMTS RLLQLPDKHS VTALLNTWAQS VHOACL S AA 


7017 


1 


1785 


I NLGNTCYMNS V I * ALFMATD FRRQVLS LN LNG CNS LM KKLQHL 
FAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFLLDR 

lheeekilkvqashkpseilecsetslqevaskaavltetprts 

DGEKTIjIEKMFGGKLRTHIRCLNCRSTSQKAI^FTDLSIiAFWPS 
YSLEYMSCPDCSQSPSIQDGGLMQASVPGPSEEPWYNPTTAAF 
ICDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKDVPQKPG 
GETTPSVTDLIiNYFLAPEILTGDNQYYCENCASLQNAEKTMQIT 
EEPEYLILTLLRFSYDQKYHVRFJCILDNVSLPLVLELPVTCRITS 
FSSLSESWSVDVDFTDLSENLAKKLKPSGTDEASCrKLVPYLLS 
SVWHSGISSESGHYYSYARNITSTDSSYWYHQSEALAbASSQ 
SHLLGRDSPSAVFEQDLENKEMSKEWFLPNDSRVTFTSFQSVQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 
LMDAITRDNKLYLQEQELNARARALQAASASCS frpngfddndp 
PGS CG PTGGGGGGG FNTVnrc T.VP 


7018


484 


1066 


SLVFRGNTWSGEAGHHCSAIiFNLAAYHQLFVGTERIRAPEI I FQ 
PSLIGEEQAGIAETLQYILDRYPKDVQEMLVQNVFLTGGNTMYP 
GMKARMBKELLEMRPFRSSFOVQLASNPVLDAWYGARDWALNHL 
DDNEVWITRKE YEEKGGE YLKEHCASNI YV P I RL PKQASRSS DA 
QAS S KG5 AAGGGGAGEQA 


7019 


1048 


335 


APGG FLVTM VFPAPS P PWMLGCCS HE VTAG P PTLCKDMS ALVAA 
RNRKIPLAPQSDWRDLPNIEVRLSDGTMARKIaRYTHHDRKNGRS 
S SGALRG VCSCVEAGKACDPAARQ FNTL I PWCLPHTGNRHNHWA 
G LYGRXiEWDGF FSTT VTNP E PMG KQGR VLH PEQHR WS VRECAR 

SQGFPDTYRLFGNILDraRQVGNAVPPPIJUCAIGLBIKLCMLAK 
ARESASAK I KEEEAAKD 


7020 


1 


2154 


FADSKRKSVLLDKIKNLQVALTSKQQSLETAMSFVARNTFKRVR
NGFLMRKVAVFFSNTPTRASPQLREAVLKLSDAGITPLFLTRQE 
DRGLINALQ INNTAVGHALVLPAGRDLTDFLENVLTCHVCLDI C 
NIDPSCGFGSWRPSFRDRRAAGSDVDIDMAFILDSAETTTLFQF 
NEMKKYIAYLVRQLDMS PDPKASQHFARVAWQHAPS ES VDNAS 
Mfi'VKVE rSLTDYGS KEKLVD FLS RGMTQLQGTRALGS AI E YT I 
ENVFESAPNPRDLKIWLMLTGEVPEQQLEEAQRVILQAKCKGY 
FFWLGIGRKVNIKEVYTFA2BPNDVFFKLVDKSTELNEEPLMR 
FGRLL PS FVS S ENAF YLS PDIRKQCD W FQGDQ PTKNIiVKFGH KQ 
VNVPNNVTSSPTSNPVTTTKPVTTTKPVTTTTKPVTTTTKPVTI 
I NQPS VKPAAAK PA PAKP VAAKP VATKTATVRPP VA VK P ATAAK 
P VAAKPAAVR P PAAAAAKP VATKP EVPRPQAAKPAATK PATTKP 
MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 
liVLKQNtiT VTD RV IGGL LAGQTYHVAWC YLRS QVRATYHGS FS 

TKKSQPPPPQPARSASSSTINLMVSTEPLALTETDICKLPKDEG 

TCRDFILKWYYnPMTlf QHliT3TrMvnnr , /^/^?w7T?x»iyo/^orttrr-i/-ti-.»^. 

* x utvirr x i urn x r\o wiKr VY I ^V>L.V30lNKNKFGSQKECEKVCA 

PVLAJCPGVISVMGT 


7021 


2 


338


VNAVS FFPNGYAFATGSDDATCRLFDLRADQELLLYSHDNI ICG 
ITSVAFSKSGRLLLAGYDDFNCNVWDTLKGDRAGVLAGHDNRVS 
CLG VTDDGMAVATGS WDS FLRIWN 


7022 


2 


856 


VYIGSFWSHPLLIPDNRKLFEAEEQDLFRDIQSLPRNAALRKLN 
DLIKRARLAKVHAYlTSSLK'CT!MPQ\7Pn^nMirvw-cT tmiui >n T v 

GRIEREHQISPGDFPNLKRMQDQLQAQDFSKFQPLKSKLLEWD 
DMLAHD I AQLMVLVRQE E SQRP I QM VKGGAFEGTLHG P FGHQ YG 
EGAGEGIDDAEWWARDKPMYDEIFYTLSPVDGKITGANAKKEM 
VRSKLPNSViyjKIWKLADIDKDGMLDDDEFAljANHLIKVKLEGH 
ELPNE LP AHLLP PS KRKVAE 


7023 


2 


748 


AMVFGGWPYVPQYRDIRRTQNADGFSTYVCLVLLVANILRILF 
WFGRRFESPLLWQSAIMILTMLLMLKLCTEVRVANELNARRRSF 
TAADSKDEEVKVAPRRSFLDFDPHHFMQWSSFSDYVQCVLAFTG 
VAGYIlTLSIDSALFVBTLGFiyVVLTEAMr^PQLYl^PJlQST 
EGMS I KMVI^TSGDAFKTAYFLLKGAPLQFSVOSLLQVLVDLA 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


RTGVTGWAQVWMFGGGGVLSSGEQIiQMPVKPERGLGPSDGWLV
SSRRGS PGTVLGLPFWLLTP VLVSRS IRSMLLLTRSPTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RLLITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLLDHRGRARCKADFRGQWVLM Y FG FTHCPD I C PDELE KL VQV 
TRQLEAEPGLPPVQPVFITVDPERDDVEAMARYVQDFHPRLLGL 
TGSTKQVAQASHS YRVYYNAGPKDEDQDY IVDHS IAIYLLNPDG 
LFTDYYGRSRSAEQI SDSVRRHMAAFRSVLS 


7025 


232 


832 


ERNSPIGNNENL*K\HSLDd J rF»nnMFPMTY^E , AT*t hnMftpo^f — 

KQVIRTCEKRPTFNQHTVFNLHQRLNTGDKLNEFKBLGKAF1SG 
SDHTQHQLIHTS EKFCGDKECGNTFLPDS E VIQYQTVHTVKKTY 
ECKECGKSFSLRSSLTGHKRIHTGEKPFKCKDCX3KAFRFHSQLS 
VHKRIHTGEKSYECKECGKAFSCG 


7026


328


1146 


NPNPS IGD 1 KDI KKAAKSMLDPAHKSHFHPVTPSLVFLCFI FDG 
LHQALLSVGVSKRSNTWGNENEERGTPYASRFKDMPNFIALEK 
oo v uj^^cL»iJijiLjV/u\^5bUKICTSSIjQVQRRFKAMMAS IGRLS 
HGBSADLLISCNAESAIGWISSRPWVGELMFTFLFGDFESPLHK 
LRKS S * LPRKHR*QP INAVRMFLDQCMDGS IALRAI VSEI P VFE 

EKKNNG*KGlGEIF*VWGCTLPPHYWGAVrTNVPKLSNSGKLLG 
QDEQPHIFG 


7027 


43 


954 


GRRLQQQQR pedaedgaegggkrgeagweggypei vkenklfeh 

YYQELKI VPEGEWGQFMDALREPLPATIiR ITGY KSHAKE ILHCL 
KNKYFKELEDLEMDGQKVEVPQPLSWYPEELAWHTNLSRKILRK 
SPHLEKFHQFLVSETESGNISRQEAVSMIPPLLLNVRPHHKILD 
MCAAPGSKTTQLI EMLHADMNVPFPEG FVIANDVDNKRC YLLVH 
QAKRLSS PC IM WNHDAS S I PRLQ I DVDGRKE 1 LFYDR I LCDVP 
CSGDGTMRKNIDVWKKWTTLNSLQLHGLQLRIATRGAEQL 


7028 


189 


608 


SRPP PE PE PGTMVEKGSDS SSEKGGVPGT PSTQS LGS UN K I RNS 
KKMQSWYSMLSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 
E1LLQGRLYLSENWICFYSNXFRWETTISIQLKEVTCLKKEKTA 

TfT.TDMBTO 


7029 


1343 


40 


VLSSNTEAKQATGTSSKLRHGTGQEKGREGPRCPSQIAQLRLWG
/PC PHAGRETGPRAS AP I PGS * GHGWH W *RKDGRG ERS EG PSAL 

SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRE 
RWRA * RS IiPGAAAS ERTEMTKERS P /R PCQG YDS SNW FTQPGK K 
Anrutw *kkxn i m vs> kijGGCIjLYPIjQS I MPE* QLR * GAHAS PPTQG 
R*GKGGPRSPLTKASGTTHrPTPFFGSIP/RPTRDSGPGTDWS\ 
AAPGQKRGHREA*QGPEPV/WGRVTTHLQGPAG*TKPLGS\RNW 
VPGPAEGEQGEGAGLEGRP * PLKGCRSTLTFSPQLS I PMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 

PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030 


2 


521 


FVCFSAPGSGOfJfiKRPVNMTTT.CaWfTDX/pnnn* t t iranTnvnnu 

EYLVKWKGWSQKYSTWEPEENILDARLLAAFEEREREMELYGPK 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARGIR I PYPGRS PQOL 
ASTSRAREGLRN\ RVCPRQRAAPAPAAP \ PRRG P SGPGPRPG * G 
PGLHFPGPGGPSKHGFVPASEQHQHQOHLPRRGPSGPGPRPG 


7031 


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/ CKPS /RHCDELHEGPSRTAALPCGKPQPKHGVEECG / PCPCLA 
PRRLTEPPALTVSPVGRAAPSGAL*PSGRACSACSHRLAPRAAL 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVPSPARSVPP 
LG AQARAAP PRLWC PRALVSG * E AS PEAVS VAAG P P VPG PT PS T 
SGSTASHSRRGC*SPR*TPAPPRRDHGRSAAFEVLTAAASAQPC 
ASQGG PR P TG AGRT PS PLGLP FSRGP PAAS ARPFCRH PS L 


7032 
7033 


1391 
689 


2104 
815 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RNLHLAPL*QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
BPWMKROFGRLHSLFWKSWQKMNSFLLTPKLDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSGILVPPNSGFSLSC\PLGDH+GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 

RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 
LMMPSSCPWRTGALGPSPAGSRAIiGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC*SWSA 
WtJWRRPPLQVSPAPSSS CRASCCWCLES IT * S SSTARSRATGAS 
SSSTCPTSRSDRGAAWTP \SPMGAPIiLPCSVP H SREEALQDPR 
NPSP*GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAM F YHAYD SYLENAFPFDELR PLTCDGHDTWG S FS LTL I DALD 
T LL \ TLFY FQ I LGNVS E FQR WEVLQDS VD FD I D VNAS VF ETN I 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMP YGTVNLLHGVNPGETP VTCTAGIGTFI VE FATLS SL 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRWYTRFDDW 
YLWVQMYKGTVSM PVFQS LEAYWPGLQS LIGDIDNAMRTFLN YY 
TVWKQFGGLPEFYNI PQGYTVEKREGYPLRPELI ESAMYLYRAT 
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF 
FLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QPFTSKLALLGQVFLDSS * PLDNFFI FIFLRLNYNKLLLAI IKK 
K 


7035 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV 
KAM F YHA YDS Y LENAF P FD E LR PLTCDGHDTWGS FS LTL I DALD 
TLL\TLFYFQILGNVSEFQRWEVLQDSVDFD1DVNASVFETNI 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGM P YGT VNLLHG VN PG ETP VTCTAG I GTF I VE FATLS S L 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG 
IGAOVDS YFEYLVKGAILLQDKKLMAM FLEYNKA IRNYTRFDDW- 
YLW VQM YKGTVSM P VFQS LE AYW PGLQS L I GD I DNAMRTFLN Y Y 
TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT 
GDP TLLE LGRDAV ES I E K I S KVE CG FAT I KDLRDHKLDNRM ES F 
FLAKTVKYLYLLFDPTNFIHNNGS TFDAVI TP YGECILGAGG YI 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREPYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QP FTS KLALLGQ VFLDS S * P LDNFF I F I FLRLNYNKLLLAI IKK 
K 


7036


442 


761 


CLAPLFSCFQI INLHLAPSGRLRWAWLRGPGRN*LPGEGPSI PT 
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFSCFQI INLHLAPSGRLRWAWLRGPGRN*LPGEGPS I PT 
RNW* ERKAGCSQPC/ PAQQHHGRP PGVS PLPRDPHPTTLRPLP P 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL 
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQITFTALEGKLRKTTEENQELVTR WMAEKAQEANRiNAR E* KR 
LQEAAS PAAERACRS SKGTSTS RTG 


7039 


155 


891


GAG AASDMS SGLRAADF PRW KRH I S EQ LRR RDRLQRQ A FE E 1 1 L 
QYNKLLEKSDL^SVLAQKI^AEKHDVPNRHEISPGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKIAECLQTI S DLETECLDLRTKLCDLERANQTLKDE YDA 
LQI TFTALEGKLRKTTEENQELVTRWMABKAQEANRLNARE * KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYES VMRDS EATGSAS S AQDSTS ENSSSVGGRCRSLKTP KKRSN 
t OoWKHKbl f >U>i>Li>i ^Sl'VKtU'PNSTGVRWVDGPLRSSPRGLG 
EPFEIKVYEIDDVERLQRRRGGASKEAMCFNAKLKILEHRQQRI 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEFDLEQVWELDSLE 
YLEALECVTERLESRVNFCKAHLMMITCFDIT 


7041 


1 


567 


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL
NDGYDWGRLNLQSVTEQSSLDDPLATAELAGTEFVAEKLNIKFV 
PAEARTGLLSFEESQRIKKLHEENKQFIiCIPRRPNWNQNTTPEE 

LKQAEKDNFLEWRRQL\VRLEEEQKL1LTPFERNLDFWRQLWRV 
IERSDIWQIVDA 


7042
7043


7 


345 


PIHMAAAALKADI \ISPLFPHIQGYLLLSASHG\ATSLHTKGAXi 
PLETVTMYTV1PKSKYVLVKPDTQYPYSENLDEFKRLAENSASN 
DDLLMAEVAISDYGDKLTLELREKY 




2 


2170 


ARGMAARDSDSEEDLVSYGTGbEPLEEGERPKKPIPLODOTVRD 
EKGRYKRFHGAFSGGFSAGYFNTVGSKEGWTPSTFVSSRQNRAD 
KSVLGPEDFMDEEDLSEFGIAPKAIVTTDDFASKTKDRIREKAR 
OLAAATAPIPGATLLDDLITPAKLSVGFELLRKMGWKEGQGVGP 
RVKRRPRRQKPDPGVKIYGCALPPGSSEGSEGEDDDYLPDNVTF 
APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLFSGG 
SERAGDLGEIGLNKGRKLGISGQAFGVGALEEEDDDIYATETLS 
KYDTVLKDBEPGDGLYGWTAPRQYKNQKESEKDLRYVGKILDGF 
3 LAS KP LS S KKI Y PP PE LPRD YR PVHYFR PMVAATS ENSH LLQ V 
LSESAGKATPDPGTHSKHQLNASKRAELLGETPIQGSATSVLEF 
LS 0KDKER I KEMKQATDLKAAQLKARSLAQNAQSSRAQPS PAAA 
AGHCSWNMALGGGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERC LDPS MTE WERGRERDE FARAALLYASS H STLS SR F 
THAKEEDDSDQVEVPRDQENDVGDKQSAVKMKMFGKLTRDTFEW 
HPDKLLFQ/RLVGLPRVKRDKYSVFNFLTLPETASLPTTQASSE 
KVS QH RGPDKSRKPS R WDTSKHE KKEDS I S E FLRLARS KAE PPK 
QQSS PLVNKEEEHAPELSAN 


7044 


276 


734 


EVYLTDEFAXGRKVAPLYELVQYAGNI I PRLYLLITVGWYVKS 
FPQSRKDILKDLVEMCRGVQHPLRGLFLRNYLLOCTRNILPDEG 

EPTDEETTGD1SDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 

ERERDPT'.PTT.T/r'T'MT iro r nrw* 
cinonuCiiinliiVu l Ci Li VrtXioy V 


7045 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP 
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK 
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADWEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7046 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADWEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7047 
7048


103 


486


QM K I E KCG WS EGLTS I KGNCHN FYTAI S KD VTY KELKNL LNS KN 
IMLIDVREIMEILEYQKIPESINVPLDEVGEALQMNPRDFKEKY 
NEVKPSKSDS/IVFSYLAGVRSKKALDTAISLGFHSYYER 




92 


627 


FFCLTLLSSWUXRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
WKDLAMT YKQRAENTQEELRE FQEGS RE YEAELETQLQQ I ETRN 
RDLLS ENNRLRMELETI KEKFE VQHS EGYRQI SALEDDLAQTKA 

IKDQLQKYIRELEQANDDLERAKRATDHGLSKTFE\QRLN\QAI 
EKKW 


7049 
7050 


393 

393 


938 
938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX
ASSLWG

KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX
ASSLWG


7051 
7052 


119 


816 


KKMNLAEICDNAKKGREYALLGNYDSSMVYYQGVMQQIQRHCQS
VRDPAIKGKWQQVRQELLEEYEQVKSIVGTLESFKIDKPPDFPV
SCQDEPFRDPAVWPPPVPAEHRAPPQ1RR/RQSRSKTSEERNGR
SRSPGTCRPST\PISKSEKFSTSRDKDYRARGRDDKGRKNMQDG
ASDGEMPKFDGAGYDKDLVEALERDIVSRNPSIHWDDIADLEEA
KKLLREAGVLPMWM


7053 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR


7054 


467 


715


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR 




1 



1036 


GTSQRSRETDARRRSAGAEPTARLPWPAALEEWPSCPCEPLGPG 
KKLKWUAMEYDEKLARFRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTPEEALPELPPGEPEFRCPERVMDLGLSEDHFSRPVGLFLASD 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPNIRVLLEHRFY1CEKSKSVKQTCDKCNTIIWGLIQTWYT 
CTG CYYRCHSKCLNL I S K PCVS S KVS HQAE Y E LN I CP ETGLDSQ 
uiKUitLKAFX/ Cb/DGWPSEARQCDYTGOYYCSHCHWNDLAV 
IPARWHNWDFEPRKVSRCSMRYIALMVSRPVLRLPJEIN 


7055 
7056


2 


527 


DSRRVSWRSWi^E/WGKHI^LFIWLS^LLFWKTFLf^NQGP" 

EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLIAYLRG 

SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL 
M 


7057


2 


527 


DSRRVSWRSWLANE/WGKHLCLFIWLSMNVliFWKTFLLYNQGP 
i "- x -"nupiLAj/ AiA-ijb KiU> AS VLNLNCSL I LLPMCRTLLAYLRG 

SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL 
M


7058


1368 


431



wxxiJiiuw^i^i^iiFiciuuKybWUilhWljNLKNHKl^ELLHASCQA | 

SGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQL 

SPQERISEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 

AQKLPTCRECGKTFYRNSQLI FHQRTHTGETYFQCTI CKKAFLR 

SSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGIiRHHEKIHTGEKP 

YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 

^Q RSHL GKKPFQ*PVTKLSFPISISQPSHKNTQLHQEELCLR 


7059 


1 


469 


FSGFGAVPDALGCKMSDLRITEAFLYMDYLCFRALCCKGPPPAR 

PEYDLVC IGLTKSRTf T«I T.T.C Yi .fcirc nrvKnnre t»tv* «r» ▼ . »~ 

" v7kjvj iv x o uiji>ivijL.i>iito f UiM v Vij J. I\»FS I KAVPFO 

NAILNVKELGGADN I RKYWSRYYQGSQGVI FVLDSASSEDDLEA 

ARN*SCTQLLQHPQLCTLPFLILA 


7060 


1 


1178 



WPAFPRQPAAAAMDALLGTGPRRARGCLGAAGPTSSGRAARTPA 
APWARPSAWLECVCWTFDLEU3QALELVYPNDFRLTDKEKSSI 
CYLSFPDSHSGCLGDTOFSFRMRQCGGGRSPWHADDRHYNSRAP 
VALQ RE PAH Y FGY VYFRQVKDS S VKRG Y FQ KS LVLVS RLP F VRL 
FQALLSLIAPEYFDKLAPCLBAVCSEIDQWPAPAPGQTLNLPVM 
GVWQVRI PSRVDKSESS PPKQFDQSNLLPAP WLAS VHELDLF 
RC FRP VLTHMQTLWELMLLG EPLLVLAPS PDVSS EMVLALTS CL 
QPLRFCCDFRPYFTIHDSEFKEFTTRTQAPPNWLGVTNPFFIK 
TLQHWPHILRVQEPKMSGDItPKQVJCIjKKPFKV*RPWDTKP 




90 


1670


SVWLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNIEYKKLVliPSQ ' 
if R F EHLVTQM KWRLQEGRG E AVYQ IG VEDNGLLVGLAE EEMRAS 
LKTLHRRAEKVGADITVLREREVDYDSDMPRKITEVLVRKVPDN 
OQFLDLRVAVLGNVDSG KSTLLG VLTQGELDNGRGRARLNLFRH 
LHE I QSGRTSS I S FE I LGFNS KGE VHG I NGTQWGQTLRMGW * * * 
RT* DGGRVWRLFEI V * MNALRGL*TSS APLRKSMGNQLN* IKNG 
VKI KRQGHPGNGLGPGNS EG VGRAGRRH*GPWALGQVVNYSDSR 
TAEEICBSSSKMITFIDLAGHHKYLKTTIFGLTSYCPDCALLLV 
SANTGIAGTTREHLGLALALKVPFFIWSKIDU^KTTVERTVR 
QIiE RVLKQ PGCHKVPMIjVTSEDDAVTAAQQFAQS PNVTPI FTLS 
S V5GES LDLLKVFLN I L P PLTNS KEQEELMQQLTEFQVDE I YTV 
PEVGTWGGTLSR* I DLLATLPTQPS PI YS KTSW P KGGDPG I 


7061 


Jb4 


710 


ARMP S P LG P P CLP VMD PETTLE E P ETARLRF RGFC YQ E V AG PRE 
ALARLRELCCQWLQPEAHSKEQMLEMLVLEQFLGTLPPEIQAWV 
RGQR PGS PBEAAALVEGIiQHDP *ARMPS PLG PPCLPVMDPETTL 
EEPETARXRFRGFCYQEVAGPREALARLRELCCQWLQPEAHSKE 
QMLEMLVLEOFLGTLPPEIOAWVRGORPGSPEEAAAIiVEGLQHD 
PGQLLG 


7062 


71 


744 


aka^Nlep^hWlsyffcipkhklkssqkdkvrqfmactqagbr 

TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 

YGRYKDPQDENKIGVDGXQQFCDDLSLDPASISVLVIAWKFRAA
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQELKDTAKFKD
FYQFTFTFAKNPGQKGLDL*MAGAYWKLVLSGRFKFIIYLWNTFL

MEHH 


7063 


2 


562 


LRTVPDLPGRRFRAMRTGQRR * PELPPDMNSLEQAEDLKAFERR 
LTBYIHCLQ PATGRWRM LL I WS VCTATG AWNWLI DPETQKVS F 
FTSLWNHPFFTI SCITLIGLFFAG IH KRWAPS 1 1 AARCRTVLA 
EYNMSCDDTGKLILKPRPHVQ*QSSLIVMGLKIAFLRISDTAKS 
HKGFLLRLDM 


7064 


300 


864 


RDTGSDPSSTRRLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS 
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC 
S SRAGRWLETPGRRRGP PACAAAAGRLRGPAP * AAPPTAS VPAR 
CRC PAARTGAPAAATWLRRRLSGLRAPALGRRRS PGPS PKSAAP 
PLLTPLGAGRAGGSRANS 


7065


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKR5EAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKLERIGKGSFGEVFKGIDNRTQQWAIKIIDLEEABDEIEDI 
QQE I TVLSQCDS S YVTKYYGS YLKGS KLWI IMEYLGGGSALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASLRSNVRAATMMQ1CDT 
YNQKHSLFNAMNRFIGAVNNMDQTVMVPSLLRDVPLADPGLDND 
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KENITMATEIGSPPRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAGYYNDLVPPIGMLNNPMNAVTTKFVRTSTNKVKCPVFWRW 
TPBGRRLVTGAS SG EFTLWNGLTFN FET I LQAHDS P VRAMTWS H 
NDMWMLTADHGG YVKYWQSNMNNVKMFQAHKEAI REARFIHNI P 
FS WP I VMVKLFSKC I LGAEMHGLCQ FIX3NFLHP I NTI FFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKEYVUiLFLALCSAKP FFS PSH I ALiKNMMLKDMEDa'DDDDD 
DDDDDDDDDDEDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCSDLGLTSVPTNIPFDTRMLDLQNNKIKEIKENDFKGLTSLY 
GL ILNNNKLTKI HP KAFLTTKKLRRL YLS HNQL S E I P LNL P KS L 
AELRIHENKVKKIQKDTFKKK 


7069 


1147 


1765 



FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETLAXQ 
TLKDKTGTDSNSTESSETSTGSLCKESFSGQVSSSSLMPLTPFW 
TLLQSNVPVLOPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP 
EKTKKGRKDKAKKSKTKMPSbVKXWQSIQRELDEEDNSSSSBED 
RVS TAQKR I E E W KQQQLVSGMAE RNANFEA 


7070 
7071


1 


547 


DGTMEDSEAVQRATALIEQRiiAQEKENRKl.PnnapnyT pimnT t u — 
LEDE KHHGAQS AALQ KVKGQER VR KTSLOLR R E 1 1 DVGG IQNLI 
ELRKKRKQKKRDALAASHEPPPEPEEITGPVDEETFLKAAVEGK 
MKVIEKFLADGGSADTCDQFRRTALHRASLEGHMEILEKLLDNG
ATVDFQ 


7072 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP 
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVFAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS




2


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP 
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7073 


50 


504


LAHGSFGVSDFPAPAAAPAHTLTSFSGSLSPQFRKPLGRAPAMP 
LVRYRKWI LGYRCVGKTSLAHQFVEGEFSEGYDPTVENTYS KI 
**jv**w/eic nunu vu irtvj^iJi^ i o JL LiF I £3 r 1 IvjVHGYVLVYS VTSL 
HSFQVI ESLYQKLHEGHGK 


7074 


263 


1003 


VCPVLCSTRQEPGHSSIiVTYFGKPTRRKEFLLGHClAAGKMNIS 
VDLETNYAELVLDVGRVTLGENSRKKMKDCKLRKKQNERVSRAM 
CALLNSGGG VI KAE I ENED YS YTKDG IGLDLENS FSNI LLFVPE 

YLDFMQNGNYFLIFVKSWSLNTSGLRITTLSSNLYKRDITSAKV 

i uminniJDr uraji KuxCu I JjKPfcLiIiAKRPRVDI QEENNMKAI* 
AGVFFDRTE LDRKE KLTFTE5 THVEI 


7075 


598 


1005 


N ¥ IWFFKRKEypPHVQKVbiWPyRbSKLVGVKRIMKKTEESESQ 

VEPEIKRKVQQKRHCSTYQPTPPLSPASKKCLTHLEDLQRNCRQ 

AITLNESTGPLLRTSIHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076 


279


1049 


LQSESSNAAEGNEQRHEDEQRSKRGGWSKGRKRKKPLRDSNAPK 

SPLTGYVRFMNERREQLRAKRPEVPFPEITRMLGNEWSKLPPEE 

KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
RQDAAROATHDHEKETEVKFR <?VFnT pt ftffft mucva no* rr 

RQLRKSNME FEERNAALQ KH VES MRTAVE KLEVDV IQ ERSRN TV 

LQQHLETLRQVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 


3 


1119 


SSMGSNSEINGLALRKTDKYGFLOc;^nY<;rQT kqcit DTmuaDKn — 
EIjKWLDMFSNWDKWLSRRFQKVKLRCRKGIPSSLRAKAWQYLSN 

skelleqnprkfeelerapgdp'kwldviekdlhrqfpfhemfaa 
rgghgqqdlyrilkaytiyrpdegycqaoapvaavllmhmpaeq 
afwclvqicdkylpgyysagleaiqldgbiffallrrasplahr 
hlrrqridpvlymtewfmcifartlpwasvlrvwdmffcegvki 
ifrvalvxlrhti^sveklrscqgmyetmeqlrnlpqqckqedf 
lvhevtnlpvtealierenaaqlkkwretrgelqyrpsrrlhgs 
raiheerrrqqpplgpsss 


7078 


483 


767 


fqgqrmageqkpssnlleqf^llakgtsgsaltalisqvleapg 
vywgelleijvnvqelaeganaaylqllnlfaygtypdyianke 

SLPELY 


7079 


2 


37^ 


swefkrpkepsgsdgesdgpidvgqegqlsqmarplstpsssq 
mqarkkrrgiiekrrrdrinsslselrrlvptafekqgssklek 
AEVLQM7VDHLKMLHATGGTGTHALLFQASFIQQIF 


7080 


200 


595 


VQLPLEAPCLSLLSCRDHSGGNRDLSRRHRDCRVYGSPQDGIPY 
LTHPIX^ODVVSVGRLQIRALATPGHTOGHLVYLLDGEPYKGPS 
CLFSGDLLFLSGCGEFPRKREELGEEGETEVRAATVPWRALKP


7081




506 


AVTEEEMILNSLSLCYHNKLILAPMVRVGTLPMRLLALDYGADI

VYCEELIDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 

RWFQMGTS 


7082 


3 


1137 


APSRNTMLMAWCRGPVIiLCLRQGLGTNSFIiHGLGQEPFEGARSL 
CC RSS PRDLRDGEREH E AAQRKAPGAES C PS LPLS I S D IGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQSATEVEERHVSPSCSTSRERPFQAGELILAETGEGETKFKK 
LFRLNNFGLLNSNWGAVPFGKI VGKFPGQ I L.RS S FGKQ YMLRRP 
ALEDYWLMKRGTAITFPKDINMILSMMDINPGDTVLEAGSGSG 
GMSLFLSKAVGSQGRVISFEVRKDHHDLAKKNYKHWRDSWKLSH 
VEEWPDNVDFIHKDISGATEDIKSLTFDAVALDMLNPHVTLPVF 
YPHLKHGGVCPVYWN I TQVI ELLD 


7083 


115 


541 


RSNAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTQQLLSEP 
S P KAPRARPCR VS TADR S VRKG I MAYSLEDLLLKVRDTLMtiADK 
PFFLVDEEDQTTVETEEYFQALAGDTVFMVLQKGQKWQPPSEQG 
TRHPLSLSHK 


7084 


3 


522 


NSVSVSSQSRFLASVPGTGVQRSAAADMAASTAAGKQRIPKVAK 
VKNXAP AE VQ I TAEQLLRE AK EK ELELLP PP PQQK I TDE EE LND 
YKLRKRKTFEDNI RKNRTV ISNW I KYAQWEESLKEIQRARS I YE 
RALDVDYRNITLWLKYAEMEMKNRQVNHARNIWDRAITTL 


7085 


243 


1499 


RQLARLRRRG WRS P FGG APMAH I T I NQ YLQQVYEAI DSRDGAS C 
ABLVS FKHPHVANPRLQMASPEEKCQQVLEPPYDEMFAAHLRCT 
YAVGNHDFI EAYKCQTVI VQS FLRAFQAHKEENWALPVMYAVAL 
DLRV FANNADQQL VKKG KS KVGDMLEKAAELLMSC FR VCAS DTR 
AGIEDSKKWGMLFLVNQLFKIYFXINKLHLCKPblRAIDSSNLK 
DDYS TAQR VTYKY YVGRKAMFDSDFKQAEE YLS FAFEHCHRSSQ 
KNKRMILIYLLPVKMLLGHMPTVELLKKYHLMQFAEVTRAVSEG 
NLLLLHEALAKHEAFFIRCGI FLI LEKLKI I TYRNLFKKVYLLL 
KTHQLSLDAFLVALKFMQVEDVDIDEVQCILANLIYMGHVKGYI 
SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


ILAARMGKQNSKLRPEVMQDLLESTDFTEHEIQEWYKGFLRDCP
SGHLSMEEFKKIYGNFFPYGDASKFAEHVFRTFDANGDGTIDFR
EF 


7087 


156 


723 


LSGSSAGKVAAPCVPPSNHELVPITTENAPKNWDKGEGASRGG 
NTRKSLEDNGSTRVTPSVQPHLQPIRNMSVSRTMEDSCELDLVY 
VTER 1 1 AVS FPS TANEENFRSNIiRE VAQMLKS KHGGN YLL FNLS 
ERRPDITKLHAKVLEFGWPDLHTPALEKICSICKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTSAASPSSLLEMAGEITETGELYSSYVGLVYMFNLIVGTGALT
MPKAFATAGWTiVQT.Vr.T.VPT nPMCTTMTTTEn/TVAMTv a 7\M7\ r>T itm 

KRMENLKEEE DD DS S TASDS D VLI RDN YERAEKRP ILS VQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLFYFCIIVYLYGDLAIYA 
AAVPFS LMQVTCSATGNDSCGVEADTKYNDTDRCWG PLRRVD 


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEADGGSDILLVVPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHXKEEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREG
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDSSPEDMRLHPMAFVSVETQ
ASLLLGLE 


7090 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE
PRPMDELVTLEEADGGSDILLVVPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK
LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ 
ASLLLGLE 


7091 


186 


1076 


EGMLTREHRCGRSEEQELEPWPSPKKARSGRWLRNGFKRKMEEP 
EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSlilEAY 
ALHKQMR I VKP KVASMEEMATFHTDAYLQHLQKVSQEGDDDHPD 
SI EYGLGYDCPATEG I FDYAAAI GGATITAAQCLI DGNCKVA1N 
WSGG WHHAKXDEASG FCYLNDAVLG ILRLRRKFERILY VDLDLH 
HGDGVEDAPSFTSICVMTVSLHKFSPGFFPGTGDVSDVGLGKGRY 
YS VNVPIQDG IQDEKYYQI CER YEPPAPNPGL 


7092 


522 


809 


KQGINEDQEESQKPRLGEGCEPISKRQMKKLIKQKQWEEQRELR 
KQKRKEKRKRKKLERQCQMEPNSDGHDRKRVRRDWHSTLRLII 
DCSFDXLM 


7093 


454 


655 


NFGVSGVELAQQASMVRMSFVIAACQLVLGLLMTSLTES5IQNS 
ECPQLCVCEIRPWFTPQSTYREA 


7094 


2 


508 


FVRSMHWGVGFASSRPCWDLSWNQSISPFGWWAGSEEPFSFYG
DI IAFPLQDYGGIMAGLGSDPWWKKTLYLTGGALLAAAAYLLHE 
LLV I R KQQE IDS KDAI I LHQFAR PNNG VPSLS P FCLKME T YLRM 
ADLPYQNYFGGKLSAQGKMPWI EYNHEKVSGTEFI I 


7095 


1 


411 


iasslpkmasllqsdrvlylvqgekkvraplsqlyfcrVcselr 
slecvshevdshycpsclenmpsaeaklkknrcancfdcpgcmh 
tls trats i s tql pdd pakttm k kay ylacg fcrwts rdvgmad 

KSVGE 


7096 


224 


2067 


ETRSLAVQEKPS QAGRRRSS RI S FAGALFLTRFLLQELLLNN FC 
SAMSPAPDAAPAPASISLFDLSADAPVFQGLSLVSHAPGEALAR 
APRTSCSGSGERESPBRKLLOXjPMDISEKLFCSTCDQTFQNHQE 
ORE H Y KLDWHRFNLKQRLKDKPliLSALDFEKQS STGDLSS I SGS 
EDSDSASEEDIiOTIJDREftATFFKr,QRPPRFVDWPVT.PY^Nartf2ni? 

LYAYRCVLGPHQDPPEEAELLLQNLQSKGPRDCWLMAAAGHFA 
GAI FQGREVVTHKTFHRYTVRAKRGTAQGLRDARGGPSHSAGAN' 
LRRYNEATLYKDVRDLLAGPSWAKALEEAGTILliRAPRSGRSLF 
FGGKGAPLQRGDPPXWDIPIiATRRPTFQELQRVLHKLTTLHVYE 
EDPREAVRiiHSPQTHWKTVREERKKPTEEEIRKICRDEKEALGQ 
NEBSPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTQSSOAVAAPLGPL 
LDEAKAPGQPELWNALLAACRAGDVGVLKLQLAPS PADPRVLSL 
LSAPLGSGGFTLLHAAAAAGRGSWRLLLEAGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRIITEPCEANAGSRQELOTBRISS 
FLAAQGDQAFHSGLETNNSNSELPLRVGLKVAQGS PLMGGQVSA 
SNSFSRIiHCRNANEDWMSAUTPRLWDVPLHHLSlPGSHbTMTYC 

LNKKSPISHEESRXiLOTiTiMKALPrTTPmnrT vucirmTtT 

**** i^<~Ljij\jijxjx\ i\nj-itr\* x inrv VJjlvno v IQALiDV X fciQ 

LDAG VR YLDLRIAHMLEGS EKNLH FVHMVYTTALVEDTLTE I SB 
WLERHPREWIbACRNFEGLSEDLHEYLVACIKNIFGDMLCPRG 
EVPTLRQLWSRGGXJVIVSYEDESSLRRHHELWPGVPYWWGNRVK 
TEALIRYLETMKSCGR 


7098


82 


956 


SSFLKRCRKVLGCWGIPSEQSLFSTLEEPRDKEIDNYCVMRLQT 
i*nxv.^vjr nrtrwitr *■ vw iv.Ki*jiAviA»yKGGSSRETCRC!HFHPSLiEA 
LVLIiLQDWQPGGVGICTSFLGISWALLDYHRALRTCLPSKPLLG 
LGSSVIYFLWNLLLLWPRVLAVALFSALFPSYVALHFLGLWLVL 
LLW VWLQGTDFM PDPS S E WLYR VTVAT I L YFS W FNVAEG RTRGR 

AIIHFAFLLSDSlLtiVATWVTHCCMT.DCr'TTJTJAT UT mr/'^nnnn 

LGLALRLVYYHWLHPSCCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFLRSLARQGYHQIWAFPFLPSGATATWPAASRSRSLA
ARSLPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSEEPGPGAD 

%jrivjj&vnvi'v/lVJrt\3VoJjfc i ValL»MARL\jAEVILSDSS ELPHCLEVC 

RQSCQMNNLPHLQWGLT WGHIS WDLLALPPQDI ILASDVFFEP 
ED FED I LAT I Y FLMH KNP KVQLrWSTYQ VRS ADWS L EALL YKWDM 
KCVH I PLES FDADKED I AES TLPGRHTVEMLVI S FAKDS I> 


7100 


205 


671 


ni1uul uauivajLii'JLiWvt' iAohbK.™rALGlGSAPPPHLSVL 
FLFS FP PQLGDPLE AFP VFKKYDRNGLNVS I ECKRVSGLE PATV 
DWAFDLTKTNMQTMYEQSE WGWKDRE KREEMTDDRAW YL I AWEN 
SS VPVAFSHFR FDVERGDEVLYW 


7101 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP
ERVPTHIVDYSEAEQSDEQLHQEIfiQANVICIVYAVNNKHSIDK
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR


7102 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP
ERVPTHIVDYSEAEOSDPOT.HDP'T Qnnfn/Tr'TxrvAxrvTKTtru r» ttmt
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR


7103 


119 


438 


GSQSSVAVNIRSGTDEESMDLMNGQASSVNIAATASEKSSSSES 
USDKGSELKKSFDAWFDVLKVTPEEYAGQITLMDVPVFKAIQP 
DE LS S CGWNKKE KYS SAP 


7104 


1670 


795


jivo v urrv7x*«j\jim\jjjoo rV^C-ljljijnr^o JjPoti ciK. V U X Lil NNAGV 
MRCPHWTTEDGFEMQFGVNHLGEAWAGAAPWVQAILPRRPPKVL 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
HVAGH I DFDDLNWQTRKYNTKAAYCQS \ KLAI VLFTKELSRRLQ 
GSGVTVNALHPGVARTEI/3RHTGIHGSTFLQHHN\WAJILLAAWS 
KSPRSWPAPAQHNTLAVAEELA\VISGKYFDGLKQKAPAPEAED 
EEVARRLWAESARLVGLEAPSVREOPLPR 


7105 


765 


143 


GQMCRRPSPKSTSCLSMTCDLP/RGIiQDPQCLALFRVAVDKHQA 
LLKAAMSGQGVDRHLFALYIVSRFLHLQSPFLTQVHSEQWQLST 
SQ I PVQQMHLFDVHN YPDYVS SGGGFGPADDHGYG VS Y I FMGDG 
MITFHISSKKSSTKTDSHRLGQHIEDALLDVASLFQAGQHFKRR 
FRGSGKENSRHRCGFLSRQTGASKASMTSTDF 


7106 
7107 


14 
1145 


1064
591 


GLQAGHPHPRSASRIPEADTH\YSKLQRAFDSIVNKDHKRNFGT 
YFR VGFFGS KFG DLDEQE FVY KE PA I TKLPE I SHR LEAFYGQC F 
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNTVLTTMHAF 
PYIKTRISVIQKEEFVLTPIEVAIEDMKKKTLQLAVAXNQEPPD 
AKMLQM VLQGS VGATVNQGP LE VAQV FLAE I PADP KIj YRHHNKL 
RLCFIC3 FIMRCGEAVEKNKRLI TADQRE YQQELKKNYNKLKENL 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCBTQLSQGS 
*I*WLQTGKKK 


7108


1 


942 


VKVALLLTNLBQPRTESEWENSPTLKMFLFQFVNLNSST FYIAP 
FLGRFTGHPGAYLRL1NRWRLEECHPSGCLIDLCMQMGIIMVLK 
QTWNNFMELGyPLIQNWWTRRKVRQEHGPERKISFPQWEKDYNL 
QPMNAYGLFDEYLEMILOFGFTTIFVAAPPIJVPT Tan mmttt?t 

RLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVITNAFVI 
AITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRIS 
DFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSIiVPYGYTLQF 
WHVIAW 


7109 


964 


102 


" "V ftluu ' *j u v e u tr t\n vj rivj & E* f n r, is, JVC o JjU«AQEAIjSIQIjQPKE 
TQPFPKSEQVYLHFLSWTEDGPEPKDKGSLPQPPITEVESQVF 
SEKIATDTSTFEATSEGTLELQQRNP KAERLRWS PAQEESFRQM 
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGEKPYKC 

QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 
KAYGWCSELIRHRRVHARKEPSH 


7110 


96 


697 


lujisur avar jjve, v i ivbbKlll VKPiiXDRYRLVKQMLTRASITPVLG 
SPSTKRRGQMLQP I IEGETAHFFEEI KEEEEDGVNLSSELGDML 
KTAVQVQSSLKKSESDVEENQEKLALDLRLSSSRAASMPELLEQ 
LW KARAE KKKLR KTLRE FE E AFYQQNGRNAQKEDR VP VL E E YRE 
YKKI KAIORLLEVIiISKQDSSKSI 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFALELNELTAE 
LKRSLPSTDTRLRPDQRYLEEGNIQAAEAQKRRIEQLQRDRRKV 
ricowiM x vnvj/iKr r KKyM UobuKJbWVfVTNNTYWRLRAEPGYGNMD 
GAVLW 


7112 


103 


495 


prcfpvadrgrligglpdwtimegktLnltctvfgnpdpeviw 

FKNDQDIQLSEHFSVKVEQAKYVSMTIKGVTSEDSGKYSINIKN 
KYGGEKIDVTVSVYKHGEKIPDMAPPQQAKPKLIPASASAAGQ 


7113 


1 


824 


kclrqawheapsslaftrwcsreeraegggnlhrsitrdpkppg 
lrpsqrpmddkkkkrspkpclaqpaqapgtlrrvpvptshsgsl. 
alglphlpspkqrakfkrvgkekgrpvlagggsgsagtplqhsf 

IiTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 
QEKIiARLHFSLDVCGEEEDDEEEEIXSVTEGIjpEEQKKTMADRNL 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWALGEKVGVHGSK 
SSGPLNLPRR 


7114 


3 


1492 


vnbvuD^iLfni ftCioyuRr iiWyH/w -LujvCi IIjIvUBSGQECKICRKI 

iylntdfvsvkqrlpkyyswercskhhlnflgqnrsyvrkkddg 
ckaywkvclhynlhkaqpaerffdpnqrgkalhqkqalrksqrs 
qtoeklykctecgkvfiqkanlwhqrthtcekpyeccecakaf 
sqkstliahqrthtgekpyecseogktfiqkstlikhqrthtge 
kpfvcdkcpkapkssyhlirhekthirqafykgikcttssliyq 
rihtsekpqcsehgkasdekpsptkhwrthtkeniyecskcgks 
frgkshlsvhqrihtgekpyecsicgktfsgkshlsvhhrthtg 

EKPYECRRCGKAFGEK^TT«TVHflRMHTRFK'PYKr T NR'rY2 ira i?c z?v 
SPL1KHQRIHTGERPYECTDCKKAFSRKSTLIKHQRIHTGEKPY 
KCSECGKAFSVKSTLIVHHRTHTGEKPYECRDCGKAFSGKSTLI 
KHQRSHTGDKNL 


7115 


1 


947 


NAAHG YNWGLW CM Y 1 1 PPQDWLDRGDE SAP I RT PAM I GCS FWD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM 
SNPGVDFGDVS ERLALRQRLKCRS FKWYLENVYPEMRVYNNTLT 
YG EVRNS KAS AYCLDQGAEDGDRAI L Y PCHGMS SQLVR YS ADGL 
LQLGPLGSTAFLPDSKCLVDDGTGRMPTLKKCEDVARPTQRLWD 
FTQSGP I VS RATGRCLE VEMSKDANFGLRLWQRCSGQKWM IRN 
WIKHARH 


7116 


866 


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGEPWKGYPNIDPETDPYVT 
PGS VINNLS INTVREVDHLRDRNSGSS S S LNTTLPSTSAWSS IR 
ASNYNVPLSSTAQSTSARNSDSKLTWSPGSVTNTSLAHELHKVP 
LPPKNITAPS R PPPGLTGQKPPLS TWDNSPLRIGGGWGNSDAR Y 
TPGSSWGBSSSGRITNWLVLKNLTPQIDGSTIiRTLCMQHGPLIT 
FHLNLPHGNALVRYSSKEEWKAOKSLHISDLFLLTL 


7117 


695 


1261 


LLISTPGGCHPPPSSIEFTYTGAWGKAt»PAPHMPCAPGALPQGA 
FVSQAARAI PLLQPSQAAQAEGLSQPARACGALCSLPWPIjRNWG 
S P I LRLPGG LRTPTNDR KTRTRS AMACW ARAQWDTLG P LKLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 


7118


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7119 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE
EMSERPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED


7120 


1991 


64 


QLGTRRCLRGDKVTNAMQDFLVTNLE PRFI E PQTANLSWFKDS 
N S TTPL I FVLS PGTDPAADL YKFAEEMKFSKKLSA I S LGQGQG P 
RAEAMMRSSIERGKWVFFQNCHLAPSWMPALERLIEHINPDKVH 
RDFRL WLTS LPS NKF PVS I LQNGS KMTIE PPRGVRANLLKS YS S 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNIPY 
EFTDGDLRICIS QLKMFLDE YDDI P YKVLKYTAGE INYGGRVTD 
DWDRRCIMNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLHGY 
LSYIKSLPLNDMPEI FGLHDNANITFAONT'TFJxT.T.rcTT Tnr/iDif 

SSSAGSQGREEIVEDVTQNILLKVPEPINLQWVMAKYPVLYEES 
MNTVLVQEVI R YNR LLQ V I TQTLQDLL KALKGL VVMS S QLELMA 
ASLYNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDFLQAWIQDG 
IPAVFWISGFFFPQAFLTGTLQNFARKFVISIDTISFDFKVMFE 
APSELTQRPQVGCYIHGLFLEGARWDPEAFQLAESQPKELYTEM 
AVI WLL PTPNRKAQDQDFYLC PI YKTLTRAGTLSTTGHSTNYVI 
AVE I PTHQPQRHW I KRGVAL I CALD Y 


7121 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTIAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE


7122 


2 


546


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTIAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE 


7123 


1 


1092 


K PAVP EARS AGTS EAGR5GAE EVS CGS VSUDGAAMRLTP RALCS 
AAQAAWRENFPLCGRDVARWPPGHMAKGLKKMQSSLKLVDCIIE 
VKDARI PLSGRNPLFQETLGLKPHLLVLNKMDLADLTBOQKIMQ 
HLEGEGLKNVI FTNCVKDENVKQ 1 2 PMVTELIGRSKRYHRKENL 
EYCIMVIGVPNVGKSSLINSLRRQHLRKGKATRVGGEPGITRAV 
MS KI QVSERPLMFLLDTPGVUAPRI ES VETGLKLALCGTVLDHL 
VGBETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVLKSVAV 
KLGKTQKVKVLTGTGNVNVIQPNYPAAARDFLQTFRRGLLGSVM 
LiU ItU V JjRGH PR V 


7124 


2 


382


IjPLTLLLAAPFAHtiLLPPGHDQSPCWHPGPALSPGTLGPLSWAM 
ANSGLQLLGY F LALGG WVG 1 I ASTALPQWKQSS YAGDAS IQLRS 
KVFVLESEWGGDSLGLPRDCX5WSCLLHSAVRSEKGFWS 


7125 


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCGSSESRGVNESHKSE 
FIELRKWbKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMIIS 
LPBSCLLT\RDTVIRSYLGAYITKWKPPPSPLLALCTFLVSEKH 
AGHRSLLEA\YIjEILPKAYTCPVCLEPEWNLLPKSLKAKAEEQ 
RAHVQE F FAS SRDF FSS LQPL FAE AVDS I FS YS ALLWAWCTVNT 
RAVYL\SPGSGNAFU3SRTPVQLAPYLDLLNHSPHVQVKAAFNE 
ETHS Y E I RTTSRWRKHEE VF I C YG PHDNQRL FLE YG F VS VHNPH 
ACVYVSRGWNQLCS 


7126 
7127


1 


733 


CRDMAAFIVPSPARRCSQKGSLGHLPTQPWLWAAMSPRGQERGT 
SHSQAREPQRPGRffLUSSLQSSPGTLGQAGTASRRRGCMVQRWV 
OVATGRRAVQVPKGALGIiALGETSPGASRGMSGGAGGCWALGWA 
PSPVLPSWLLEGPPPWLSIISDSGTQRPSPRRCPARPSPWGPQC 
WRGGRIASAEASST*rPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA*SCVARPAGSRSSRPAAA 




1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR*IEKRPFXEI*RRIPRIF 
AKQKQI*S*NSQKrGASEIDRGRKEADCSDAPAAARIGAVSVFR 
RSTQEARVS PRSNAKSANLRAVRAD* WEHFVLLFHTPEQFIiAEC 
ICRST**K*WHQLC*PLSSL*TGI,KRKLLL*VLFRI*WLKDCDV 
*FCQKIFATNFCNWQNLIQ*EE*KPVEYSVEN*HIMNLLLPM+L 
CQS S LRDQT I VTWRM * RN YS M FR I NM I S S L* DGS I H I PLKLH FY 
PALIFTLTVPINSCCQRPLPLFAHQSIKTLASSGSPMLACLRFL 
LVKKRAFIHTPRSPGCSV*CKHVLVKDNKNNCVGSEV 


7128


2 


5228 


GRVDLWTILLGRSALRELSQIEAELNKHWRRLLEGLSYYKPPSP 
SSAEKVKANKDVASPLKELGLRISKFLGLDEEQSVQLLQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL 
LT YFQDE RH PY RVE YADCVDKLEKEL VS KYRQQ FEE L YKTE APT 
WETHGNLMTERQVSRWF VQCLREQSMLLEI I FLYYAYFEMAPSD 
LbVLTKMFKEQGFGSRQTNRHLVDETMDPFVDRIGYFSALILVE 
GMDIESIiHKCALDDRRELHQFAQDGLICQDMDCLMLTFGDIPHH 
APVLLAWALLRHTLNPEETSSWRKIGGTAIQLNVFQYLTRLLQ 
SLASGGNDCTTSTACMCVYGLLS F VLTS LELHTLGNQQDI I DTA 
CEVLADPSLPELFWGTEPTSGLGIILDSVCGMFPHLLSPLU3LL 
PALVSGKS TAKKVYS FLDKMS FYNEL YKHKPHDVISHEDGTLWR 
RQTPKLLYPLGGQTNLRIPQGTVGQVMLDDRAYLVRWEYSYSSW 
TLFTCEIEMLLHWSTADVIQHCQRVKPIIDLVHKVISTDLSIA 
DCLLPI TSRI YMLLQRLTTVIS PPVDVIASCVNCLTVIjAARNPA 
KVWTDLRHTGFLPFVAHPVSSLSQMISAEGMNAGGYGNLLMNSE 
QPQGEYGVTIAFLRLITTLVKGQIiGSTQSQGLVPCVMFVLKEMI, 








PSYHKWRYNSHGVREQIGCLILELIHAILNLCHETDLHSSHTPS 
LQFLCICSLAYTEAGQTVINIMGIGVDTIDMVMAAQPRSDGAEG 
QGQGQLLIKTVKLAFSVTNNVIRLKPPSNWSPLEQALSQHGAH 
GNNLIAVLAXYIYHKHDPALPRLAIQLLKRLATVAPMSVYACLG 
NDAAAIRDAFLTRLQSK\IE\DMRIK\VMIL\EFLTVA\VETQP 
GLI ELFLNLE VKDG\SDGS KE FS LGM W\ SCLHAV/ VWEL I DSQQ 
QDR YWC P P LLHRAAI AFLHALWQ DRRDS AMLVLRTKP KFWENLT 
S PL FGTLS P PS ETSE PS I LETCAL 1 MKI I CLE I YYWKGS LDQ P 
L KDTLKKFS I E KRFA YWSG Y VKS LAVH VAETEG S S CTS LLE YQM 
L VS AWRMLL 1 1 ATTHAD I MHLTDS WRRQL FLD VLDG T KALLL V 
PASVNCLRLGSMKCTLLLILLRQWKRELGSVDEILGPLTEILEG 
VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSDI PQ YSQLVLNV 
CETLQEEVIALFT)QTRHSLAIjGSATEDKDSMETDDCSRSRHRDQ 
RDGVCVLGLHLAKELCEVDEDGDS WLQVTRRLP I LPTLLTTLE V 
SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGITQSICLPL 
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLLKT 
LR YN F L PE ALD FVG VHQERTLQCLNAVRTVQS LACLE EADHT VG 
F I LQLSN FM KE WHFHL PQLMRDI QVNLG YLCQACTS FLHSRKML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASEQQALHTVQYGLLKILSKTLAALRHFTPDVCQILLDQSLDLA 
EYNFLFALSFTTPTFDSEVAPSFGTLLATVNVALNMLGELDKKK 
E PLTQAVG LS TQAEGTRTLKS LLMFTMENC F YLLIS QAMR YLRD 
PAVHPRDKQRMKQELSSELSTLLSSLSRYFRRGAPSSPATGVLP 
S PQGKSTSLS KASPESQEPLIQLVQAFVRHMQR 




1 


1054 


FRR FRWRRRLH *AGPASSAGGS PGEAS GTMSGELPPNINI KEPR 
WDQSTFIGRANHFFTVTDPRNILLTNEQLESARKIVHDYRQGIV 
PPGLTENELWRAKYIYDSAFHPDTGEKMILIGRMSAQVPMNMTI 
TGCMMTFYRTTPAVLFWQWINQSFNAWNYTNRSGDAPLTVNEL 
ij i ai v isA i l GAVATALGLNALTKHVS P L I GRFVP FAA VAAANC I 
NIPLMRQRELKVGIPVTDENGNRLGESANAAKQAITQVWSRIL 
iwirumifrr amim i.jj£,iuvAt bivKFPWMSAPIQVGLVGFCLVFA 
TPLCCALFPQKSSMSVTSLEAELQAKIQESHPELRRVYFNKGL 


7130 


2 


780 


HEVPSLQTSDPLPGSVQRCS\A^VSQPNKENWCQDHLYNSLGRKG 

iontvayr inKoyoooo VJjXriivoWUblW I PSDVGKQQLIiSIjHRSS 

RCESHQDLLPDIADSHQQGTEKLSDLTLQDSQKWWNRNLPLN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DSKFVDADFSDNVCSGNTLHSLNSPRTPKKPVNSKLGLSPYLTP 
YNDSDKLNDYLWRGPSPNQQNIVQSLREKFQCLSSSSFA 


7131 


805 


573 


AAAEGHIEWKFJblEACKVNPFAKDRWGNIPLDDAVQFNHLEW 
KLIiQDYQDSYTLSETQAEAAAEALSKENLESMV 


7132 
7133


1420 


1087 


IDMLLLSGALVSGPYTLXTTAVSADLGTHKSLKGNAHALSTVTA 
IIDGTGSVGAALGPLLAGLLSPSGWSNVFYMLMFADACALLFLI 
RLIHKELSCPGSATGDQVPFKEQ 




2 


3648 


QQIPGLLPAHGESGDALRKPRLQKPITGHLDDLFFTLYPSLEKF 
EEELLELHVQDHFQEGCGPLDGGALEILERRLRVGVHNGLGFVQ 
RPQWVLVPEMDVALTRSASFSRKWSSSKTSSGSQALVLRSRL 
RL PEMVGH PAFAVI FQLEYVFSSPAGVDGHAAS VTSLSNLACMH 
MVRWAVWNPLLEADSGRVTLPLQGGIQPNPSHCLVYKVPSASMS 
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCL 
ARPTSQLPHGS QASPAQAQE FPLEAG I SHLEADLS QTSLVLE TS 
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
E I LDAN KQ PAEAVS ATE P VTFN PQKE ESDCLQSNEMVLQFLAFS 
RVAQDCRGTS WPKTVYFT FQF YRFPPATTPRLQLVQLDEAGQ PS 
SGALTHILVPVSRDGTFDAGSPGFQLRYMVGPGFLKPGERRCFA 
RYLAVQTLQI D VWDGDSLLL IGSAAVQMKHLLRQGRPAVQASHE 
SEQ ID NO:
Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence
Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








LEWATEYEQDNMWSGDMLGFGRVKPIGVHSWKGRLHLTLAN 

VGHPCEQKVRGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR 
KHWQAQKIJUOVDSEIAAMIJjT^ 

ijctu^hus VKi^i^ijLjUi^RRGTSVIJVQQSVRTQHLRDLQVIAAYR 
ERTKAESIASLLSLAITTEHTLHATLGVAEFFEFVLKNPHNTQH 
TVTVEIDNPELSVIVDSQEWRDFKGAAGLHTPVEEDMFHLRGSL 
APQLYLRPHETAHVPFKFQSFSAGQLAMVQASPGLSNEKGMDAV 
SPWKSSAVPTKHAKVLFRASGGKPIAVLCLTVELQPHVVDQVFR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 
ICETQNVGPGEPRDIFLKVASGPSPEIKDFFVI^YSDRWLATPT 
QTWQ VYLHSLQRVDVS CVAGQLTRLS LVLRGTQTVRKVRAFTSH 
*• ua u^iuvtusv* VLiPPRGVQD LHVGVRPLRAG S R FVHLNL VDVD 
CHQLVAS WLVCLCCRQPL I SKAFE I MLAAGEGKGVNKR I TYTNP 

YPSRRTFHLHSDHPELLR FREDS FQVGGGETYTIGLQFAPSQRV 
GEBEILIYINDHEDKNBEAFCVKVIYQ 


7134 


2115 


1111 


GGEGFSYPFHVGLSLGTPLDPHYVLLEVHyUN PTYEEGLIDNSG 
LRLFYTMDIRKYDAGVIEAGLWVSLFHTIPPGMPEFQSEGHCTL 
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL 
IAYDDDFDFNFQEFQYLKEEQTILPGDNLITECRYNTKDRAEMT 
WGGLSTRSEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEI 
YRPVTTWPPIIKSPKQYKNLSFMDAMNKFKWTKKEGLSFNKLVL 
SLP VNVRCSKTDNAEWS IQGMTALPPDI ERP YKAE PLVCGTSSS 
SSLHRDFSINLLVCLLLLSCTLSTKSL 


7135 


2 


2072 


FVPRVTPRSLSLG^PKGESVGSITQPIiPSSYLIFRAASESiDGRC~ 

WLDALBLALRCSSLLRLGTCKPGRDGEPGTSPDASPSSLCGLPA 

SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 

KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVETVSE 

ENKS LM WTLLKQLR PGMDLSR WLPTF VLEFRS FLNKLS D YYYH 

ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG 

ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 

GS I TAKS R FYGNS L S ALLDG KATLTFLNRAEDYTLTM PYAHCKG 

iuxwiniijuijijtaA.v l lciLAKWNFQAQLEFKLKPFFGGSTSINQl 

SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 

QRLRQHTVPLEEQTELESERLWQHVTRAISKGDQHRATQEKFAL 

EBAQRQRARERQESLMPWKPQLFHLDPITQEWHYRYEDHSPWDP 

LKDIAOFEQDGILPTTinOPAVJiDOT'PPT ^ennnnunn^nn^/v^. 
" v;4 - i i-»vvii/\v/VKU 1 1 r LAio ck3 PKHERSGPDQR L 

RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLGALHEAILSIREAQQBLHRHLS^ILSSTARAAQA 
PTPGLLQS PRSWFLLCVFLACQLFINHILK 


7136 


2 


418 


DF VPS FRR PSGNTS QTVWLLRAATLFKFVAril 1 rpvtuut nnMT v — 

S QQR KVRQM I EQLQNS KAV I QS KDAT I QELKE K IA YLEAENLEM 

HDRMEHLIEKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV 
IRWET 


7137 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI 
PDDLDGNTNKRS KE VR VLQEMQLLQ VAAMNYRLR PLE KFVT Y FT 
RMEQLSDKESYKLSCQLEPENP 


7138 


7 


466 


WASGMSTVpGusRHSIiGIQVRGGWGVTGGEEESLTVPVADTWQA 
GS FKVATQ ERNPQRAQMRLRRQ KKG WP FLGD FLTE LQRLDS A I 

PDDLDGNTNKRS KEVRVLQEMQLLQVAAMNYTUiRPLEKFVTY FT 
RMEQLSDKESYKLSCQLEPENP 


7139 
7140


1 


357 


SLRNSARGLKMAASAARGAAALRRSINQP VAFVRRI PWTAASSQ 
LKEHFAQFGHVRRCILPFDKETGFHRGLGWVQFSSEEGLRNALQ 
QENH 1 1 DGVKVQVHTRRP KL PQTS DDE KKD F 




1461


1957 


RASSLQ VLKAWGGL I P S S FQQQH TGQ YALEKLFDLKVYDC PCS F 
NWJVSLEKQLRPSQPWPRGKCRKTPGWBEARPKAQDLRGDLGKT 








QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPTGLQSQ 

WTPKGQDPPLMFSEDYQKSLLEQYHLGLDQKLRKyWGBLIWNF 
ADFMTNQCG 


7141 
7142 


124 


1073 


UJSRSCWLDMEDLEEDVRFiVDETLDFGGLSPSDSREEEblTVL 
VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEEILD 
EANRLAAQLEQCALQDRESAGEGLG PRR VK PS PRRET FVLKDS P 
VRDLLPTVNSLTRSTPS/LKQPDASTPE***EGVSQGSPGYIWK 
EALQHEEGVTHLQSVPCIQKPSIFSS\SRSTPPVRGRAGPSGRA 
AASEETRAAKLRGAAAKSSCQLPIPSAIPRPASRMPLTSRSVPP 

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7143 


658 


839 


LIFLMLHMEbKMLSSVTLHIRAFLYWICLKPTSCLIFQNVLNLL 
KK*SRAVGWWMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT 
L K KNKVTH I LNVAYG VENAFLS D FTYKS I S I LDLPETN I LS YF P 
ECFEFIEEAKRKDGWLVHCNA 


7144 


3 


773 


SliEMSSDGEPLSRMDSEDSlSSTIMDVDSTlSSGRSTPAMMNGQ 
GSTTSSS KN I AYNCCWDQCQACFNSS PDLADHIRS IHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA 
S FASQGGLARHVPTH FSQQNS S KVS SQP KAKEES PS KAGMNKRR 

KLKNKRRRSLARPHDFFDAQTLDAIRHRAICFNLSAHIESLGKG 
no v vr«j> i vjjiJjjbFFOIKYKTLQKNISTIISKSLKI 




1 


988 


FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 
RCPAPR PAG VS YVIRDE VEK YNRNG VNALOLDPALNRLFTAGRI) 
SIIRIWSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLISASS 
DTT VKVWNAH KG FCMS TLRTHKD YVKALAYAKDKE LVAS AG LDR 
QIFLWDVNTLTALTASNNTVTTSSLSGNKDSIYSLAMNQLGTII 
VSGSTEKVLRVWDPRTCAKLMKLKGHTDNVKALLLNRDGTQCLS 

GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 
RDRKI YCTDLRNPD I R VL I CE 
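
The one-letter legend repeated in the table header can be expressed as a simple lookup. The following is a minimal sketch, not part of the original disclosure: the names `LEGEND`, `decode`, and `split_at_stops` are hypothetical, and the choice to ignore whitespace inside a segment is an assumption made for convenience.

```python
# One-letter legend as stated in the table header.
# LEGEND, decode and split_at_stops are illustrative names, not from the source.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode(segment):
    """Return the legend name of every symbol in a segment, skipping whitespace."""
    names = []
    for symbol in segment:
        if symbol.isspace():
            continue
        if symbol not in LEGEND:
            # Anything outside the legend is reported rather than guessed at.
            raise ValueError(f"symbol {symbol!r} is not in the legend")
        names.append(LEGEND[symbol])
    return names


def split_at_stops(segment):
    """Split a segment into the non-empty stretches between stop codons ('*')."""
    compact = "".join(segment.split())
    return [part for part in compact.split("*") if part]
```

For example, `decode("ETRQE")` yields Glutamic Acid, Threonine, Arginine, Glutamine, Glutamic Acid, and `split_at_stops("PTDACA*SCVARPAGSRSSRPAAA")` yields the two stretches on either side of the stop codon. Raising an error on unlisted symbols is a deliberate choice here, since it surfaces characters that do not belong to the legend instead of silently dropping them.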



WHAT IS CLAIMED IS:
1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO: 1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO: 1-1786 and 3573-5358, an active domain of SEQ ID NO: 1-1786 and
3573-5358, and complementary sequences thereof.



2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358. 

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the
method further comprises reverse transcribing an annealed RNA molecule into a cDNA
polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1786 and
3573-5358, a mature protein coding portion of SEQ ID NO: 1-1786 and 3573-5358, an active
domain of SEQ ID NO: 1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO: 1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
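
Claims 1 and 5 refer to complementary sequences, and claim 3 to sequence identity greater than about 90%. The claims do not state how identity is to be computed, so the following is only an illustrative sketch under stated assumptions: the two sequences are assumed to be already aligned to equal length with `-` as the gap symbol, identity is taken as identical positions divided by compared alignment columns, and the names `reverse_complement` and `percent_identity` are hypothetical.

```python
# Illustrative helpers only; the names and the identity convention are
# assumptions, not definitions taken from the claims.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string made of A, C, G, T or N."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned to the same length. Columns that are
    gaps in both sequences are ignored; a gap opposite a residue counts as a
    mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no columns to compare")
    return 100.0 * matches / compared
```

As a usage example, `reverse_complement("ATGC")` gives `"GCAT"`, and `percent_identity("ACGTACGTAC", "ACGTACGTAT")` gives `90.0`, since nine of the ten columns match. Other conventions, such as dividing by the length of the shorter sequence, would give different values for gapped alignments.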